public class LexTreeLinguist extends java.lang.Object implements Linguist
getInitialSearchState returns a SearchState. Successor states can be retrieved via calls to SearchState.getSuccessors(). There are a number of search state sub-interfaces that are used to indicate different types of states in the search space:
getSearchStateOrder can be used to retrieve the order of states returned by the linguist.
Depending on the vocabulary size and topology, the search space represented by the linguist may include a very large
number of states. Some linguists will generate the search states dynamically, that is, the object representing a
particular state in the search space is not created until it is needed by the SearchManager. SearchManagers often
need to be able to determine if a particular state has been entered before by comparing states. Because SearchStates
may be generated dynamically, the SearchState.equals() call (as opposed to the reference equality operator '==') should be used to determine if states are equal. The states returned by the linguist will generally provide very efficient implementations of equals and hashCode. This will allow a SearchManager to maintain collections of states in HashMaps efficiently.
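The point about equals versus '==' can be illustrated with a minimal sketch. DemoState and its two fields are hypothetical stand-ins invented for this example; they are not the linguist's actual state classes, and the real states may compare different data.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class StateEqualityDemo {

    /** Hypothetical stand-in for a dynamically generated search state. */
    static final class DemoState {
        private final int nodeId;         // assumed: position in the search graph
        private final String wordHistory; // assumed: language-model context

        DemoState(int nodeId, String wordHistory) {
            this.nodeId = nodeId;
            this.wordHistory = wordHistory;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof DemoState)) {
                return false;
            }
            DemoState other = (DemoState) o;
            return nodeId == other.nodeId && wordHistory.equals(other.wordHistory);
        }

        @Override
        public int hashCode() {
            return Objects.hash(nodeId, wordHistory);
        }
    }

    /** Returns true only the first time a logically distinct state is entered. */
    static boolean enter(Set<DemoState> visited, DemoState state) {
        return visited.add(state);
    }

    public static void main(String[] args) {
        Set<DemoState> visited = new HashSet<>();
        DemoState first = new DemoState(7, "the cat");
        DemoState again = new DemoState(7, "the cat"); // new object, same logical state

        System.out.println(first == again);        // false: different objects
        System.out.println(first.equals(again));   // true: same logical state
        System.out.println(enter(visited, first)); // true: first visit
        System.out.println(enter(visited, again)); // false: already entered
    }
}
```

Because the set relies on equals and hashCode, a state that is regenerated on demand is still recognized as already visited.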
LexTreeLinguist Characteristics
Some characteristics of this linguist:
This linguist is not a general purpose linguist. It does impose some constraints:
Design Notes
The following are some notes describing the design of this linguist. They may be helpful to those who want to understand how this linguist works but are not necessary if you are only interested in using this linguist.
Search Space Representation
It has been shown that representing the search space as a tree can greatly reduce the number of active states in a search, since the units at the beginnings of words can be shared across multiple words. For example, with a large vocabulary (60K words) and a flat representation, at the end of a word we have to provide transitions to the initial state of each possible word. That is 60K transitions. In a tree-based system we only need to provide transitions to each initial phone (within its context). That is about 1600 transitions, which is a substantial reduction. Conceptually, this tree consists of a node for each possible initial unit. Each node can have an arbitrary number of children, which can be either unit nodes or word nodes.
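The sharing of word-initial units can be sketched with a small prefix tree. This is only an illustration under simplifying assumptions: the five-word lexicon is made up, units are context-independent phones rather than units within their context, and a word is recorded as a field on its last unit node rather than as a separate word node. It is not the HMMTree implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class LexTreeSketch {

    /** One unit (phone) node; a non-null word marks the end of a pronunciation. */
    static final class Node {
        final Map<String, Node> children = new TreeMap<>();
        String word;
    }

    /** Builds a prefix tree in which words with the same initial phones share nodes. */
    static Node build(Map<String, List<String>> lexicon) {
        Node root = new Node();
        for (Map.Entry<String, List<String>> entry : lexicon.entrySet()) {
            Node node = root;
            for (String phone : entry.getValue()) {
                node = node.children.computeIfAbsent(phone, p -> new Node());
            }
            node.word = entry.getKey();
        }
        return root;
    }

    /** Counts the unit nodes below the given node (the root itself is not a unit). */
    static int countUnits(Node node) {
        int count = 0;
        for (Node child : node.children.values()) {
            count += 1 + countUnits(child);
        }
        return count;
    }

    /** Hypothetical toy lexicon, used only for this illustration. */
    static Map<String, List<String>> toyLexicon() {
        Map<String, List<String>> lexicon = new LinkedHashMap<>();
        lexicon.put("cat", Arrays.asList("K", "AE", "T"));
        lexicon.put("cab", Arrays.asList("K", "AE", "B"));
        lexicon.put("can", Arrays.asList("K", "AE", "N"));
        lexicon.put("dog", Arrays.asList("D", "AO", "G"));
        lexicon.put("dot", Arrays.asList("D", "AA", "T"));
        return lexicon;
    }

    public static void main(String[] args) {
        Map<String, List<String>> lexicon = toyLexicon();
        Node root = build(lexicon);
        // A flat layout needs one entry transition per word; the tree needs one per initial phone.
        System.out.println("flat entry transitions: " + lexicon.size());
        System.out.println("tree entry transitions: " + root.children.size());
        System.out.println("unit nodes in tree: " + countUnits(root));
    }
}
```

With this toy lexicon the flat layout has five entry points and fifteen unit instances, while the tree has two entry points and ten unit nodes. The same effect, at scale, is what takes the 60K word-entry transitions down to roughly 1600.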
This linguist uses the HMMTree class to build and represent the tree. The HMMTree is given the dictionary and language model and builds the lextree. Instead of representing the nodes in the tree as phonemes and words, as is typically done, the HMMTree represents the tree as HMMs and words. The HMM is essentially a unit within its context. This is typically a triphone (although for some units, such as SIL, it is a simple phone). Representing the nodes as HMMs instead of phonemes yields a much larger tree, but also has some advantages:
Word Histories
We use explicit backoff for word histories. This technique has proven useful and reduces the number of states. The reasoning is as follows. With a vocabulary of size N, there are N^2 unique bigram histories, so the token stack can hold up to N^2*K unique tokens, where K is the number of states per HMM. For a 100k-word vocabulary with 3 states per HMM, that is 3*10^10 tokens (maximum). Of course, a large majority of them will be pruned, but that is still far too many. If you stick with the n-gram actually used (i.e., accounting explicitly for backoff), the number drops tremendously. Most bigrams don't have corresponding trigrams, so we only need to store as many explicit tokens as there are bigrams that have trigrams.
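The arithmetic behind this estimate can be checked with a short sketch. The class and method names are made up for the example, and the count of five million bigrams with trigrams is a hypothetical figure chosen only to show the scale of the reduction; it does not come from any real language model.

```java
class TokenBoundSketch {

    /** Upper bound on tokens when every bigram history is kept: N^2 * K. */
    static long fullHistoryBound(long vocabularySize, int statesPerHmm) {
        long histories = Math.multiplyExact(vocabularySize, vocabularySize);
        return Math.multiplyExact(histories, (long) statesPerHmm);
    }

    /** Upper bound when only bigrams that actually have trigrams are kept. */
    static long backoffBound(long bigramsWithTrigrams, int statesPerHmm) {
        return Math.multiplyExact(bigramsWithTrigrams, (long) statesPerHmm);
    }

    public static void main(String[] args) {
        long full = fullHistoryBound(100_000L, 3);
        // Hypothetical count of bigrams that have trigrams in some model.
        long backoff = backoffBound(5_000_000L, 3);
        System.out.println("full histories: " + full);      // 30000000000, i.e. 3*10^10
        System.out.println("explicit backoff: " + backoff); // 15000000
    }
}
```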
Modifier and Type | Class and Description |
---|---|
class | LexTreeLinguist.LexTreeEndUnitState: Represents a unit in the search space |
class | LexTreeLinguist.LexTreeEndWordState: Represents the final end-of-utterance word |
class | LexTreeLinguist.LexTreeHMMState: Represents an HMM state in the search space |
class | LexTreeLinguist.LexTreeNonEmittingHMMState: Represents a non-emitting HMM state |
class | LexTreeLinguist.LexTreeUnitState: Represents a unit in the search space |
class | LexTreeLinguist.LexTreeWordState: Represents a word state in the search space |
Modifier and Type | Field and Description |
---|---|
protected boolean | addFillerWords |
protected edu.cmu.sphinx.linguist.lextree.HMMTree | hmmTree |
protected float | languageWeight |
static java.lang.String | PROP_ACOUSTIC_MODEL: The property that defines the acoustic model to use when building the search graph |
static java.lang.String | PROP_ADD_FILLER_WORDS: The property that controls whether filler words are automatically added to the vocabulary |
static java.lang.String | PROP_CACHE_SIZE: The property that defines the size of the arc cache (zero to disable the cache). |
static java.lang.String | PROP_DICTIONARY: The property that defines the dictionary to use for this grammar |
static java.lang.String | PROP_FULL_WORD_HISTORIES: The property that determines whether or not full word histories are used to determine when two states are equal. |
static java.lang.String | PROP_GENERATE_UNIT_STATES: The property to control whether or not the linguist will generate unit states. |
static java.lang.String | PROP_GRAMMAR: The property that defines the grammar to use when building the search graph |
static java.lang.String | PROP_LANGUAGE_MODEL: The property for the language model to be used by this grammar |
static java.lang.String | PROP_UNIGRAM_SMEAR_WEIGHT: The property that determines the weight of the smear. |
static java.lang.String | PROP_UNIT_MANAGER: The property that defines the unit manager to use when building the search graph |
static java.lang.String | PROP_WANT_UNIGRAM_SMEAR: The property that determines whether or not unigram probabilities are smeared through the lextree. |
Fields inherited from interface Linguist: PROP_FILLER_INSERTION_PROBABILITY, PROP_LANGUAGE_WEIGHT, PROP_SILENCE_INSERTION_PROBABILITY, PROP_UNIT_INSERTION_PROBABILITY, PROP_WORD_INSERTION_PROBABILITY
Constructor and Description |
---|
LexTreeLinguist() |
LexTreeLinguist(AcousticModel acousticModel, UnitManager unitManager, LanguageModel languageModel, Dictionary dictionary, boolean fullWordHistories, boolean wantUnigramSmear, double wordInsertionProbability, double silenceInsertionProbability, double fillerInsertionProbability, double unitInsertionProbability, float languageWeight, boolean addFillerWords, boolean generateUnitStates, float unigramSmearWeight, int maxArcCacheSize) |
Modifier and Type | Method and Description |
---|---|
void | allocate(): Allocates the linguist. |
void | deallocate(): Deallocates the linguist. |
protected void | generateHmmTree() |
Dictionary | getDictionary() |
LanguageModel | getLanguageModel(): Retrieves the language model for this linguist |
SearchGraph | getSearchGraph(): Retrieves search graph. |
void | newProperties(PropertySheet ps): This method is called when this configurable component needs to be reconfigured. |
void | startRecognition(): Called before a recognition |
void | stopRecognition(): Called after a recognition |
@S4Component(type=Grammar.class) public static final java.lang.String PROP_GRAMMAR
@S4Component(type=AcousticModel.class) public static final java.lang.String PROP_ACOUSTIC_MODEL
@S4Component(type=UnitManager.class, defaultClass=UnitManager.class) public static final java.lang.String PROP_UNIT_MANAGER
@S4Boolean(defaultValue=true) public static final java.lang.String PROP_FULL_WORD_HISTORIES
@S4Component(type=LanguageModel.class) public static final java.lang.String PROP_LANGUAGE_MODEL
@S4Component(type=Dictionary.class) public static final java.lang.String PROP_DICTIONARY
@S4Integer(defaultValue=0) public static final java.lang.String PROP_CACHE_SIZE
@S4Boolean(defaultValue=false) public static final java.lang.String PROP_ADD_FILLER_WORDS
@S4Boolean(defaultValue=false) public static final java.lang.String PROP_GENERATE_UNIT_STATES
@S4Boolean(defaultValue=true) public static final java.lang.String PROP_WANT_UNIGRAM_SMEAR
@S4Double(defaultValue=1.0) public static final java.lang.String PROP_UNIGRAM_SMEAR_WEIGHT
See Also: PROP_WANT_UNIGRAM_SMEAR
protected boolean addFillerWords
protected float languageWeight
protected edu.cmu.sphinx.linguist.lextree.HMMTree hmmTree
public LexTreeLinguist(AcousticModel acousticModel, UnitManager unitManager, LanguageModel languageModel, Dictionary dictionary, boolean fullWordHistories, boolean wantUnigramSmear, double wordInsertionProbability, double silenceInsertionProbability, double fillerInsertionProbability, double unitInsertionProbability, float languageWeight, boolean addFillerWords, boolean generateUnitStates, float unigramSmearWeight, int maxArcCacheSize)
public LexTreeLinguist()
public void newProperties(PropertySheet ps) throws PropertyException
Configurable
Specified by: newProperties in interface Configurable
Parameters: ps - a property sheet holding the new data
Throws: PropertyException - if there is a problem with the properties.
public void allocate() throws java.io.IOException
Linguist
Implementor's Note - A well written linguist will allow allocate to be called multiple times without harm. This will allow a linguist to be shared by multiple search managers.
public void deallocate() throws java.io.IOException
Linguist
Implementor's Note - if the linguist is being shared by multiple searches, deallocate should only actually deallocate things when the last call to deallocate is made. Two approaches for dealing with this:
(1) Keep an allocation counter that is incremented during allocate and decremented during deallocate. Only when the counter reaches zero should the actual deallocation be performed.
(2) Do nothing in deallocate and just let the GC take care of things.
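Approach (1) can be sketched as a simple reference count. The class below is a hypothetical illustration, not code from this linguist: the boolean flag stands in for whatever expensive structures a real linguist would build and release, and ignoring an unmatched deallocate call is a design choice made for this sketch.

```java
class SharedAllocationSketch {

    private int allocationCount; // number of outstanding allocate() calls
    private boolean allocated;   // stands in for the real, expensive resources

    /** Builds the resources on the first call; later calls only bump the counter. */
    public synchronized void allocate() {
        if (allocationCount == 0) {
            allocated = true; // a real linguist would build its search structures here
        }
        allocationCount++;
    }

    /** Releases the resources only when the last user deallocates. */
    public synchronized void deallocate() {
        if (allocationCount == 0) {
            return; // unmatched call: ignored in this sketch
        }
        allocationCount--;
        if (allocationCount == 0) {
            allocated = false; // a real linguist would drop its search structures here
        }
    }

    public synchronized boolean isAllocated() {
        return allocated;
    }

    public static void main(String[] args) {
        SharedAllocationSketch shared = new SharedAllocationSketch();
        shared.allocate(); // first search manager
        shared.allocate(); // second search manager shares the same instance
        shared.deallocate();
        System.out.println(shared.isAllocated()); // true: one user remains
        shared.deallocate();
        System.out.println(shared.isAllocated()); // false: last user is gone
    }
}
```

This also satisfies the allocate note above: calling allocate several times is harmless because only the first call does any work.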
Specified by: deallocate in interface Linguist
Throws: java.io.IOException - if an IO error occurs
public SearchGraph getSearchGraph()
Linguist
Implementor's note: This method is typically called at the beginning of each recognition and therefore should be
Specified by: getSearchGraph in interface Linguist
public void startRecognition()
Specified by: startRecognition in interface Linguist
public void stopRecognition()
Specified by: stopRecognition in interface Linguist
public LanguageModel getLanguageModel()
public Dictionary getDictionary()
protected void generateHmmTree()