public class SpeechMarker extends BaseDataProcessor
The algorithm for inserting the two signals, SPEECH_START and SPEECH_END, is as follows.
The algorithm is always in one of two states: 'in-speech' or 'out-of-speech'. In the 'out-of-speech' state, it reads audio until it encounters audio that is speech. Once it has read more than 'startSpeech' of continuous speech, it considers speech to have started and inserts a SPEECH_START signal 'speechLeader' time before the point where speech first started. The state then changes to 'in-speech'.
Now consider the 'in-speech' state. If the algorithm reads audio that is speech, that audio is scheduled for output. If the audio is non-speech, the algorithm reads ahead until it has 'endSilence' of continuous non-speech. At that point it considers speech to have ended, inserts a SPEECH_END signal 'speechTrailer' time after the first non-speech audio, and returns to the 'out-of-speech' state. If any speech audio is encountered in between, the accounting starts over. While speech audio is being processed, the delay is lowered to a minimal amount, which helps segment both slow speech with noticeable delays and fast speech where delays are minimal.
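The two-state logic described above can be sketched as a small standalone program. This is only an illustration under stated assumptions, not the SpeechMarker implementation: it works offline on a whole array instead of a stream, measures every threshold in frames rather than milliseconds, and leaves an utterance unclosed if the input ends while 'in-speech'. The class `SpeechMarkerSketch`, the methods `mark` and `parse`, and the output notation (`[` for SPEECH_START, `]` for SPEECH_END, `S` for a speech frame, `.` for a non-speech frame) are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Sphinx4.
class SpeechMarkerSketch {

    // Converts a string such as "..SS." into per-frame speech flags ('S' = speech).
    static boolean[] parse(String frames) {
        boolean[] speech = new boolean[frames.length()];
        for (int i = 0; i < frames.length(); i++) {
            speech[i] = frames.charAt(i) == 'S';
        }
        return speech;
    }

    // All thresholds are counted in frames (an assumption made for simplicity).
    static String mark(boolean[] speech, int startSpeech, int endSilence,
                       int speechLeader, int speechTrailer) {
        List<Integer> starts = new ArrayList<>(); // frame indices preceded by SPEECH_START
        List<Integer> ends = new ArrayList<>();   // frame indices preceded by SPEECH_END
        boolean inSpeech = false;
        int run = 0;          // current run of speech (out-of-speech) or non-speech (in-speech)
        int lastBoundary = 0; // never place a leader before the previous SPEECH_END

        for (int i = 0; i < speech.length; i++) {
            if (!inSpeech) {
                run = speech[i] ? run + 1 : 0;
                if (run > startSpeech) { // more than 'startSpeech' of continuous speech
                    int firstSpeech = i - run + 1;
                    starts.add(Math.max(lastBoundary, firstSpeech - speechLeader));
                    inSpeech = true;
                    run = 0;
                }
            } else {
                run = speech[i] ? 0 : run + 1; // any speech restarts the accounting
                if (run >= endSilence) {
                    int firstNonSpeech = i - run + 1;
                    int end = Math.min(i + 1, firstNonSpeech + speechTrailer);
                    ends.add(end);
                    lastBoundary = end;
                    inSpeech = false;
                    run = 0;
                }
            }
        }

        // Render the frames with the markers inserted at the recorded positions.
        StringBuilder out = new StringBuilder();
        int s = 0, e = 0;
        for (int i = 0; i <= speech.length; i++) {
            if (e < ends.size() && ends.get(e) == i) { out.append(']'); e++; }
            if (s < starts.size() && starts.get(s) == i) { out.append('['); s++; }
            if (i < speech.length) out.append(speech[i] ? 'S' : '.');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // startSpeech=2, endSilence=3, speechLeader=1, speechTrailer=1 (all in frames)
        System.out.println(mark(parse("..SSSS....SS..."), 2, 3, 1, 1));
        // prints ".[.SSSS.]...SS..."
    }
}
```

In the printed result, the first burst of four speech frames exceeds the start threshold and is wrapped with one leading and one trailing non-speech frame, while the later two-frame burst is too short to open a new utterance.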
Modifier and Type | Field and Description |
---|---|
static java.lang.String | PROP_END_SILENCE: The property for the amount of time in silence (in milliseconds) to be considered as utterance end. |
static java.lang.String | PROP_SPEECH_LEADER: The property for the amount of time (in milliseconds) before speech start to be included as speech data. |
static java.lang.String | PROP_START_SPEECH: The property for the minimum amount of time in speech (in milliseconds) to be considered as utterance start. |
logger
Constructor and Description |
---|
SpeechMarker() |
SpeechMarker(int startSpeechTime, int endSilenceTime, int speechLeader) |
Modifier and Type | Method and Description |
---|---|
Data | getData(): Returns the next Data object. |
void | initialize(): Initializes this SpeechMarker. |
boolean | inSpeech() |
void | newProperties(PropertySheet ps): This method is called when this configurable component needs to be reconfigured. |
getPredecessor, setPredecessor
getName, initLogger, toString
@S4Integer(defaultValue=200) public static final java.lang.String PROP_START_SPEECH
@S4Integer(defaultValue=200) public static final java.lang.String PROP_END_SILENCE
@S4Integer(defaultValue=50) public static final java.lang.String PROP_SPEECH_LEADER
public SpeechMarker(int startSpeechTime, int endSilenceTime, int speechLeader)
public SpeechMarker()
public void newProperties(PropertySheet ps) throws PropertyException
newProperties in interface Configurable
newProperties in class ConfigurableAdapter
Parameters: ps - a property sheet holding the new data
Throws: PropertyException - if there is a problem with the properties.
public void initialize()
initialize in interface DataProcessor
initialize in class BaseDataProcessor
public Data getData() throws DataProcessingException
getData in interface DataProcessor
getData in class BaseDataProcessor
Throws: DataProcessingException - if a data processing error occurs
public boolean inSpeech()