It is really great to see the wide variety of APIs arising around Pocketsphinx. A recent one is the QtSpeechRecognition API, implemented by Code-Q for assistive applications. The undertaking is quite ambitious; the main features include:
- Speech recognition engines are loaded as plug-ins.
- Engine is controlled asynchronously, causing only minimal load to the application thread.
- Built-in task queue makes plug-in development easier and forces unified behavior between engine integrations.
- Engine integration handles the audio recording, making it easy to use from the application.
- Application can create multiple grammars and switch between them.
- Setting mute temporarily disables speech recognition, allowing co-operation with audio output (speech prompts or audio cues).
- Includes integration to PocketSphinx engine (latest codebase) as a reference.
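To make the asynchronous control and task queue ideas from the list concrete, here is a minimal sketch in Python. It is not the QtSpeechRecognition API: the class `EngineWorker`, its `post` method and the task names are all hypothetical, chosen only to show how a single worker thread can process grammar switches, mute changes and audio in order while the application thread returns immediately.

```python
import queue
import threading


class EngineWorker:
    """Hypothetical engine wrapper: runs tasks one at a time on its own thread."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._muted = False
        self.log = []  # records what the worker did, in order
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def post(self, name, *args):
        """Called from the application thread; only enqueues, never blocks on work."""
        self._tasks.put((name, args))

    def shutdown(self):
        """Ask the worker to stop after the queued tasks and wait for it."""
        self._tasks.put(None)
        self._thread.join()

    def _run(self):
        # All engine work happens here, strictly in the order tasks were posted.
        while True:
            task = self._tasks.get()
            if task is None:
                break
            name, args = task
            getattr(self, "_do_" + name)(*args)

    def _do_set_grammar(self, grammar):
        self.log.append(("grammar", grammar))

    def _do_set_mute(self, muted):
        self._muted = muted
        self.log.append(("mute", muted))

    def _do_process_audio(self, chunk):
        # While muted, audio is discarded instead of being decoded.
        if self._muted:
            self.log.append(("dropped", chunk))
        else:
            self.log.append(("decoded", chunk))


def demo():
    worker = EngineWorker()
    worker.post("set_grammar", "commands")
    worker.post("process_audio", "chunk-1")
    worker.post("set_mute", True)       # e.g. while a speech prompt plays
    worker.post("process_audio", "chunk-2")
    worker.post("set_mute", False)
    worker.post("process_audio", "chunk-3")
    worker.shutdown()
    return worker.log
```

Because every request goes through the same queue, a mute or grammar change can never overtake audio that was submitted before it, which is one plausible reading of the "unified behavior" the feature list mentions.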
You can discuss features and find more details in the following thread on the Qt mailing list. The sources are under review in the qtspeech project, branch wip/speech-recognition.
The implementation already includes some pretty interesting features; for example, it intelligently saves and restores the CMN (cepstral mean normalization) state for more robust recognition. So let us see how it goes.
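To illustrate why saving and restoring CMN state can help, here is a toy sketch of running cepstral mean normalization in Python. It is an assumption-laden illustration, not the code used by PocketSphinx or by the Qt plug-in: the class `LiveCmn`, its `weight` parameter and the two-dimensional "frames" are hypothetical. The point is only that a session started from a saved mean normalizes its very first frames well, while a cold start does not.

```python
class LiveCmn:
    """Toy running cepstral mean normalization over feature frames."""

    def __init__(self, dim, initial_mean=None, weight=1.0):
        # Start from a saved mean if one is given, otherwise from zeros.
        self.mean = list(initial_mean) if initial_mean is not None else [0.0] * dim
        # The prior mean counts as `weight` virtual frames in the running average.
        self._sum = [m * weight for m in self.mean]
        self._count = weight

    def normalize(self, frame):
        """Subtract the current mean estimate, then update the estimate."""
        out = [x - m for x, m in zip(frame, self.mean)]
        self._sum = [s + x for s, x in zip(self._sum, frame)]
        self._count += 1
        self.mean = [s / self._count for s in self._sum]
        return out

    def save_state(self):
        """Return the mean estimate so a later session can start from it."""
        return list(self.mean)

    @classmethod
    def restore(cls, state, weight=1.0):
        return cls(len(state), initial_mean=state, weight=weight)


def demo():
    channel_offset = [10.0, -5.0]  # constant bias, e.g. from a microphone

    # First session: cold start, the first frame is not normalized at all.
    first = LiveCmn(dim=2)
    cold_first_frame = first.normalize(channel_offset)
    for _ in range(98):
        first.normalize(channel_offset)
    saved = first.save_state()

    # Second session: restored state, the first frame is already near zero.
    second = LiveCmn.restore(saved)
    warm_first_frame = second.normalize(channel_offset)
    return cold_first_frame, warm_first_frame
```

In this toy setup the cold start leaves the full channel offset in the first frame, while the restored session removes almost all of it immediately, which is the kind of robustness gain the feature is aiming at.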