public class BinaryLoader
extends java.lang.Object
Note that all probabilities in the grammar are stored in the LogMath log base. Language probabilities in the language model file are stored in log base 10 and are converted to the LogMath base.
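The base conversion mentioned above follows from the change-of-base identity logB(p) = log10(p) * ln(10) / ln(B). The following is a minimal sketch of that arithmetic only. The class `LogBaseSketch`, the method `log10ToBase`, and the target base 1.0001 are hypothetical names and values chosen for illustration; they are not taken from this page, and the loader's actual conversion code may differ.

```java
/** Hypothetical helper, not part of BinaryLoader: converts a log10 value to another log base. */
final class LogBaseSketch {
    private LogBaseSketch() {}

    /**
     * Converts log10(p) to logB(p) with the change-of-base identity
     * logB(p) = log10(p) * ln(10) / ln(B).
     */
    static double log10ToBase(double log10Value, double targetBase) {
        if (targetBase <= 0.0 || targetBase == 1.0) {
            throw new IllegalArgumentException("target base must be positive and not 1");
        }
        return log10Value * Math.log(10.0) / Math.log(targetBase);
    }

    public static void main(String[] args) {
        double base = 1.0001;      // assumed target base, for illustration only
        double log10Prob = -2.0;   // log10(0.01)
        double converted = log10ToBase(log10Prob, base);
        // Raising the base to the converted value recovers the linear probability (about 0.01).
        System.out.println(converted + " -> " + Math.pow(base, converted));
    }
}
```

A base close to 1 spreads probabilities over a wide range of log values, which is one reason a decoder might prefer it over base 10.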
Constructor | Description |
---|---|
BinaryLoader(java.io.File location, java.lang.String format, boolean applyLanguageWeightAndWip, float languageWeight, double wip, float unigramWeight) | Initializes the binary loader |
BinaryLoader(java.lang.String format, boolean applyLanguageWeightAndWip, float languageWeight, double wip, float unigramWeight) | Initializes the binary loader |
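The constructor parameters `applyLanguageWeightAndWip`, `languageWeight`, and `wip` suggest that log probabilities can be rescaled at load time. This page does not state the formula, so the sketch below assumes a convention common in log-domain decoders: multiply each log probability by the language weight, then add the log of the word insertion probability. The class `LanguageWeightSketch` and its method are hypothetical, and natural logs are used for simplicity rather than the LogMath base.

```java
/** Hypothetical illustration, not the BinaryLoader implementation. */
final class LanguageWeightSketch {
    private LanguageWeightSketch() {}

    /**
     * Returns a new array where each log probability is scaled by the
     * language weight and offset by log(wip). ASSUMPTION: this is a common
     * log-domain convention, not a formula confirmed by the documentation.
     */
    static float[] applyWeightAndWip(float[] logProbs, float languageWeight, double wip) {
        if (wip <= 0.0) {
            throw new IllegalArgumentException("wip must be positive");
        }
        float logWip = (float) Math.log(wip); // natural log, for simplicity
        float[] adjusted = new float[logProbs.length];
        for (int i = 0; i < logProbs.length; i++) {
            adjusted[i] = logProbs[i] * languageWeight + logWip;
        }
        return adjusted;
    }

    public static void main(String[] args) {
        float[] logProbs = {(float) Math.log(0.5), (float) Math.log(0.1)};
        float[] adjusted = applyWeightAndWip(logProbs, 2.0f, 0.5);
        System.out.println(java.util.Arrays.toString(adjusted));
    }
}
```

In the linear domain this corresponds to raising each probability to the power `languageWeight` and multiplying by `wip`.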
Modifier and Type | Method | Description |
---|---|---|
void | deallocate() | |
boolean | getBigEndian() | Returns true if the loaded file is big-endian. |
long | getBigramOffset() | Returns the location (or offset) into the file where bigrams start. |
float[] | getBigramProbabilities() | Returns all the bigram probabilities. |
int | getBytesPerField() | Returns the multiplier for the size of an NGram (1 for 16 bits, 2 for 32 bits). |
int | getLogBigramSegmentSize() | Returns the log of the bigram segment size. |
int | getLogNGramSegmentSize() | Returns the log of the NGram segment size. |
int | getMaxDepth() | Returns the maximum depth of the language model. |
float[] | getNGramBackoffWeights(int n) | Returns all the NGram backoff weights at a specified N order. |
long | getNGramOffset(int n) | Returns the location (or offset) into the file where NGrams start at a specified N order. |
float[] | getNGramProbabilities(int n) | Returns all the NGram probabilities at a specified N order. |
int[] | getNGramSegments(int n) | Returns the NGram segment table at a specified order. |
int | getNumberBigrams() | Returns the number of bigrams. |
int | getNumberNGrams(int n) | Returns the number of NGrams at a specified N order. |
int | getNumberTrigrams() | Returns the number of trigrams. |
int | getNumberUnigrams() | Returns the number of unigrams. |
float[] | getTrigramBackoffWeights() | Returns all the trigram backoff weights. |
long | getTrigramOffset() | Returns the location (or offset) into the file where trigrams start. |
float[] | getTrigramProbabilities() | Returns all the trigram probabilities. |
int[] | getTrigramSegments() | Returns the trigram segment table. |
edu.cmu.sphinx.linguist.language.ngram.large.UnigramProbability[] | getUnigrams() | Returns all the unigrams. |
java.lang.String[] | getWords() | Returns all the words. |
byte[] | loadBuffer(long position, int size) | Loads the contents of the memory-mapped file, starting at the given position and for the given size, into a byte buffer. |
protected void | loadModelLayout(java.io.InputStream inputStream) | Loads the language model layout from the given input stream. |
public BinaryLoader(java.io.File location, java.lang.String format, boolean applyLanguageWeightAndWip, float languageWeight, double wip, float unigramWeight) throws java.io.IOException
Parameters:
location - location of the model
format - file format
applyLanguageWeightAndWip - if true apply language weight and word insertion penalty
languageWeight - language weight
wip - word insertion probability
unigramWeight - unigram weight
Throws:
java.io.IOException - if an I/O error occurs

public BinaryLoader(java.lang.String format, boolean applyLanguageWeightAndWip, float languageWeight, double wip, float unigramWeight)
Parameters:
format - file format
applyLanguageWeightAndWip - if true apply language weight and word insertion penalty
languageWeight - language weight
wip - word insertion probability
unigramWeight - unigram weight

public void deallocate() throws java.io.IOException
Throws:
java.io.IOException

public int getNumberUnigrams()

public int getNumberBigrams()

public int getNumberTrigrams()

public int getNumberNGrams(int n)
Parameters:
n - the desired order

public edu.cmu.sphinx.linguist.language.ngram.large.UnigramProbability[] getUnigrams()

public float[] getBigramProbabilities()

public float[] getTrigramProbabilities()

public float[] getTrigramBackoffWeights()

public int[] getTrigramSegments()

public int getLogBigramSegmentSize()

public float[] getNGramProbabilities(int n)
Parameters:
n - the desired order

public float[] getNGramBackoffWeights(int n)
Parameters:
n - the desired order

public int[] getNGramSegments(int n)
Parameters:
n - the desired order

public int getLogNGramSegmentSize()

public java.lang.String[] getWords()

public long getBigramOffset()

public long getTrigramOffset()

public long getNGramOffset(int n)
Parameters:
n - the desired order

public int getMaxDepth()

public boolean getBigEndian()

public int getBytesPerField()

public byte[] loadBuffer(long position, int size) throws java.io.IOException
Parameters:
position - the starting position in the file
size - the number of bytes to load
Throws:
java.io.IOException - if IO went wrong

protected void loadModelLayout(java.io.InputStream inputStream) throws java.io.IOException
Parameters:
inputStream - stream to read the language model data
Throws:
java.io.IOException - if IO went wrong
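
To make the `loadBuffer(long position, int size)` and `getBigEndian()` descriptions more concrete, here is a minimal sketch of reading a region of a file through a memory mapping and decoding an integer with an explicit byte order, using only standard `java.nio` calls. The class `MappedReadSketch` and its methods `readRegion` and `decodeInt` are hypothetical and are not the BinaryLoader implementation, which may map and cache the file differently.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical illustration of a memory-mapped region read; not part of BinaryLoader. */
final class MappedReadSketch {
    private MappedReadSketch() {}

    /** Maps size bytes starting at position and copies them into a new byte array. */
    static byte[] readRegion(Path file, long position, int size) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped = channel.map(FileChannel.MapMode.READ_ONLY, position, size);
            byte[] bytes = new byte[size];
            mapped.get(bytes); // copy the mapped region into the heap array
            return bytes;
        }
    }

    /** Decodes a 32-bit integer at offset using the given byte order. */
    static int decodeInt(byte[] bytes, int offset, boolean bigEndian) {
        return ByteBuffer.wrap(bytes)
                .order(bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN)
                .getInt(offset);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mapped-read", ".bin");
        tmp.toFile().deleteOnExit();
        // Two padding bytes, then the big-endian encoding of 42, then one padding byte.
        Files.write(tmp, new byte[] {9, 9, 0, 0, 0, 42, 9});
        byte[] region = readRegion(tmp, 2, 4);
        System.out.println(decodeInt(region, 0, true)); // prints 42
    }
}
```

Passing the byte order explicitly mirrors the idea that a loader records the file's endianness once and applies it to every field it decodes.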