USEnglishTokenizer (sphinx4-core 5prealpha-SNAPSHOT API)

java.lang.Object
- edu.cmu.sphinx.alignment.USEnglishTokenizer

All Implemented Interfaces:

TextTokenizer
```
public class USEnglishTokenizer
extends java.lang.Object
implements TextTokenizer
```
Converts the Tokens (US English words) in an Utterance into a list of words and puts the resulting list back into the Utterance. The tokens that get expanded are usually numbers, such as "23" (expanded to "twenty" "three").
This class is a translation of the following code from flite:
lang/usenglish/us_text.c
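
The kind of expansion described above can be illustrated with a small standalone sketch. It does not call sphinx4 and is not the library's implementation: the class name `NumberExpansionSketch`, its methods, and the restriction to numbers from 0 to 99 are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of sphinx4 or flite.
final class NumberExpansionSketch {
    private static final String[] ONES = {
        "zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"
    };
    private static final String[] TENS = {
        "", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"
    };

    // Expands a number in the range 0..99 into one word per spoken part.
    static List<String> expandNumber(int n) {
        if (n < 0 || n > 99) {
            throw new IllegalArgumentException("sketch only handles 0..99: " + n);
        }
        List<String> words = new ArrayList<>();
        if (n < 20) {
            words.add(ONES[n]);
            return words;
        }
        words.add(TENS[n / 10]);
        if (n % 10 != 0) {
            words.add(ONES[n % 10]);
        }
        return words;
    }

    // Splits text on whitespace and expands tokens of one or two digits;
    // every other token is passed through unchanged.
    static List<String> expand(String text) {
        List<String> out = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            if (token.matches("\\d{1,2}")) {
                out.addAll(expandNumber(Integer.parseInt(token)));
            } else {
                out.add(token);
            }
        }
        return out;
    }
}
```

With this sketch, `NumberExpansionSketch.expand("23")` yields the two words "twenty" and "three", matching the example in the class description. The real tokenizer handles many more token types than this.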

Constructor Summary

Constructors
Constructor and Description

USEnglishTokenizer()
Constructs a default USTokenWordProcessor.

Method Summary

Modifier and Type	Method and Description
`java.util.List<java.lang.String>`	`expand(java.lang.String text)` Processes the utterance.
`Item`	`getTokenItem()` Returns the currently processing token Item.
`boolean`	`isPronounceable(java.lang.String word)` Returns true if the given word is pronounceable.
`static boolean`	`kingLike(Item tokenItem)` Returns true if the given token item contains a token that is in a king-like context, e.g., "King" or "Louis".
`static boolean`	`sectionLike(Item tokenItem)` Returns true if the given token item contains a token that is in a section-like context, e.g., "chapter" or "act".

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - USEnglishTokenizer
```
public USEnglishTokenizer()
```
    Constructs a default USTokenWordProcessor. It uses the USEnglish regular expression set (USEngRegExp) by default.
- Method Detail
  - getTokenItem
```
public Item getTokenItem()
```
    Returns the currently processing token Item.
    
    Returns:
    
    the current token Item; null if no item
  - expand
```
public java.util.List<java.lang.String> expand(java.lang.String text)
```
    Processes the utterance.
    
    Specified by:
    
    expand in interface TextTokenizer
    
    Parameters:
    
    text - the utterance containing the tokens
    
    Returns:
    
    the list of tokens
  - kingLike
```
public static boolean kingLike(Item tokenItem)
```
    Returns true if the given token item contains a token that is in a king-like context, e.g., "King" or "Louis".
    
    Parameters:
    
    tokenItem - the token item to check
    
    Returns:
    
    true or false
  - sectionLike
```
public static boolean sectionLike(Item tokenItem)
```
    Returns true if the given token item contains a token that is in a section-like context, e.g., "chapter" or "act".
    
    Parameters:
    
    tokenItem - the token item to check
    
    Returns:
    
    true or false
  - isPronounceable
```
public boolean isPronounceable(java.lang.String word)
```
    Returns true if the given word is pronounceable. This method was originally called us_aswd() in Flite 1.1.
    
    Parameters:
    
    word - the word to test
    
    Returns:
    
    true if the word is pronounceable, false otherwise
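
The context checks described for kingLike and sectionLike can be sketched without sphinx4. The sketch below is a simplification and not the library's code: it works on a plain list of strings instead of Item, looks only at the immediately preceding token, and uses hypothetical word lists. The idea that such predicates choose between readings of a Roman numeral is also an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of sphinx4 or flite.
final class ContextSketch {
    // Assumed word lists; the real lists live in the library source.
    private static final Set<String> KING_WORDS =
        new HashSet<>(Arrays.asList("king", "queen", "louis", "henry"));
    private static final Set<String> SECTION_WORDS =
        new HashSet<>(Arrays.asList("chapter", "act", "section", "part"));

    enum Reading { ORDINAL, CARDINAL, LETTERS }

    // True if the token before position index is a king-like word.
    static boolean kingLike(List<String> tokens, int index) {
        return previousIn(tokens, index, KING_WORDS);
    }

    // True if the token before position index is a section-like word.
    static boolean sectionLike(List<String> tokens, int index) {
        return previousIn(tokens, index, SECTION_WORDS);
    }

    private static boolean previousIn(List<String> tokens, int index, Set<String> words) {
        if (index <= 0 || index >= tokens.size()) {
            return false;
        }
        return words.contains(tokens.get(index - 1).toLowerCase(Locale.ROOT));
    }

    // One plausible use: pick how a Roman numeral at index is read,
    // e.g. "Louis XIV" as an ordinal and "chapter II" as a cardinal.
    static Reading readingFor(List<String> tokens, int index) {
        if (kingLike(tokens, index)) {
            return Reading.ORDINAL;
        }
        if (sectionLike(tokens, index)) {
            return Reading.CARDINAL;
        }
        return Reading.LETTERS;
    }
}
```

Keeping the predicates static and free of side effects mirrors the static signatures listed above, so a caller can test a token's context without a tokenizer instance.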