public class SpeechMarker extends BaseDataProcessor
The algorithm for inserting the two signals (SPEECH_START and SPEECH_END) is as follows.
The algorithm is always in one of two states: 'in-speech' or 'out-of-speech'. In the 'out-of-speech' state, it reads audio until it encounters audio that is speech. Once it has read more than 'startSpeech' milliseconds of continuous speech, it considers speech to have started and inserts a SPEECH_START signal at 'speechLeader' time before the point where speech first started. The state of the algorithm then changes to 'in-speech'.
Now consider the case when the algorithm is in the 'in-speech' state. If it reads audio that is speech, that audio is scheduled for output. If the audio is non-speech, it reads ahead until it has 'endSilence' milliseconds of continuous non-speech. At that point it considers speech to have ended: a SPEECH_END signal is inserted at 'speechTrailer' time after the first non-speech audio, and the algorithm returns to the 'out-of-speech' state. If any speech audio is encountered in between, the accounting starts over. While speech audio is being processed, the delay is lowered to a minimal amount. This helps to segment both slow speech with noticeable delays and fast speech where delays are minimal.
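The two-state logic described above can be sketched in a few lines of Java. This is only an illustrative sketch, not the SpeechMarker source: the class `SpeechMarkerSketch`, the method `mark`, the fixed frame length `frameMs`, and the `"TYPE@frameIndex"` string encoding of a marker are all hypothetical names and assumptions introduced here. It also assumes each frame has already been classified as speech or non-speech, and it leaves out the delay-lowering behaviour mentioned at the end of the description.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified sketch of the in-speech / out-of-speech state machine. */
final class SpeechMarkerSketch {
    private SpeechMarkerSketch() {}

    /**
     * Returns markers encoded as "TYPE@frameIndex", meaning the signal is
     * inserted just before the frame at that index. All times are in
     * milliseconds; every frame is assumed to last frameMs (an assumption).
     */
    static List<String> mark(boolean[] isSpeech, int frameMs,
                             int startSpeechMs, int endSilenceMs,
                             int speechLeaderMs, int speechTrailerMs) {
        List<String> markers = new ArrayList<>();
        boolean inSpeech = false; // current state of the algorithm
        int runStart = -1;        // first frame of the run that contradicts the state
        int lastEnd = 0;          // position of the previous SPEECH_END, if any

        for (int i = 0; i < isSpeech.length; i++) {
            if (isSpeech[i] == inSpeech) {
                // Frame agrees with the current state, so any pending run of
                // contradicting frames is discarded and accounting starts over.
                runStart = -1;
                continue;
            }
            if (runStart < 0) {
                runStart = i;
            }
            int runMs = (i - runStart + 1) * frameMs;

            if (!inSpeech && runMs > startSpeechMs) {
                // Enough continuous speech: mark the start speechLeader earlier,
                // but never before the previous SPEECH_END.
                int at = Math.max(lastEnd, runStart - speechLeaderMs / frameMs);
                markers.add("SPEECH_START@" + at);
                inSpeech = true;
                runStart = -1;
            } else if (inSpeech && runMs >= endSilenceMs) {
                // Enough continuous non-speech: mark the end speechTrailer after
                // the first non-speech frame, clamped to the end of the input.
                int at = Math.min(isSpeech.length, runStart + speechTrailerMs / frameMs);
                markers.add("SPEECH_END@" + at);
                lastEnd = at;
                inSpeech = false;
                runStart = -1;
            }
        }
        return markers;
    }
}
```

For example, with 10 ms frames, 100 ms of non-speech followed by 300 ms of speech and 300 ms of non-speech, and with startSpeech = 200, endSilence = 200, speechLeader = 50 and speechTrailer = 50, the sketch places SPEECH_START before frame 5 and SPEECH_END before frame 45. A speech burst of only 100 ms produces no markers at all.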
| Modifier and Type | Field and Description |
|---|---|
| static java.lang.String | PROP_END_SILENCE: The property for the amount of time in silence (in milliseconds) to be considered as utterance end. |
| static java.lang.String | PROP_SPEECH_LEADER: The property for the amount of time (in milliseconds) before speech start to be included as speech data. |
| static java.lang.String | PROP_START_SPEECH: The property for the minimum amount of time in speech (in milliseconds) to be considered as utterance start. |
Inherited fields: logger

| Constructor and Description |
|---|
| SpeechMarker() |
| SpeechMarker(int startSpeechTime, int endSilenceTime, int speechLeader) |
| Modifier and Type | Method and Description |
|---|---|
| Data | getData(): Returns the next Data object. |
| void | initialize(): Initializes this SpeechMarker. |
| boolean | inSpeech() |
| void | newProperties(PropertySheet ps): This method is called when this configurable component needs to be reconfigured. |
Inherited methods: getPredecessor, setPredecessor, getName, initLogger, toString

@S4Integer(defaultValue=200) public static final java.lang.String PROP_START_SPEECH
@S4Integer(defaultValue=200) public static final java.lang.String PROP_END_SILENCE
@S4Integer(defaultValue=50) public static final java.lang.String PROP_SPEECH_LEADER
public SpeechMarker(int startSpeechTime,
int endSilenceTime,
int speechLeader)
public SpeechMarker()
public void newProperties(PropertySheet ps) throws PropertyException
Specified by: newProperties in interface Configurable
Overrides: newProperties in class ConfigurableAdapter
Parameters: ps - a property sheet holding the new data
Throws: PropertyException - if there is a problem with the properties.

public void initialize()
Specified by: initialize in interface DataProcessor
Overrides: initialize in class BaseDataProcessor

public Data getData() throws DataProcessingException
Specified by: getData in interface DataProcessor
Specified by: getData in class BaseDataProcessor
Throws: DataProcessingException - if a data processing error occurs

public boolean inSpeech()