QtSpeechRecognition API for Qt Using Pocketsphinx


It is great to see the wide variety of APIs growing up around Pocketsphinx. A recent addition is the QtSpeechRecognition API, implemented by Code-Q for assistive applications. The undertaking is quite ambitious; the main features include:

  • Speech recognition engines are loaded as plug-ins.
  • Engine is controlled asynchronously, placing only minimal load on the
    application thread.
  • Built-in task queue makes plug-in development easier and forces
    unified behavior between engine integrations.
  • Engine integration handles the audio recording, making it easy to use
    from the application.
  • Application can create multiple grammars and switch between them.
  • Setting mute temporarily disables speech recognition, allowing
    co-operation with audio output (speech prompts or audio cues).
  • Includes an integration with the PocketSphinx engine (latest codebase)
    as a reference.
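
To make the asynchronous control and built-in task queue more concrete, here is a minimal sketch of the general pattern in plain standard C++. It is not taken from the QtSpeechRecognition sources: the class name EngineTaskQueue and its post() method are hypothetical, and the real implementation most likely builds on Qt's own threading facilities instead. The idea is only that the application thread hands over a task and returns immediately, while a single worker thread runs the tasks in order.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical illustration of an engine task queue.
// This is NOT the QtSpeechRecognition API, only the general pattern.
class EngineTaskQueue {
public:
    EngineTaskQueue() : worker_([this] { run(); }) {}

    // Finishes all queued tasks, then stops the worker thread.
    ~EngineTaskQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    // Called from the application thread; returns without running the task.
    void post(std::function<void()> task) {
        assert(task && "task must be callable");
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    // Worker loop: runs tasks one at a time, in the order they were posted.
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) {
                    return;  // stop requested and nothing left to do
                }
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // executed outside the lock so post() never blocks on it
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist before it starts
};
```

An engine plug-in written against a queue like this would post operations such as "load grammar" or "start listening" as tasks, which is one way the unified behavior between engine integrations mentioned above could be enforced.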

You can discuss the features and find more details in the related thread on the Qt mailing list. The sources are under review in the qtspeech project, on the branch wip/speech-recognition.

The implementation already includes some interesting features; for example, it saves and restores the CMN (cepstral mean normalization) state for more robust recognition. Let us see how it goes.
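
For readers unfamiliar with why saving CMN (cepstral mean normalization) state helps, the sketch below shows the idea with a simple running mean. It is an assumption-laden illustration, not the code used by Pocketsphinx or by the Qt plug-in: the class RunningCmn, its State struct, and the plain cumulative-average update are all hypothetical. The point is that a recognizer restored from a saved mean starts with a sensible estimate instead of relearning it from zero.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical running cepstral mean normalization.
// NOT the Pocketsphinx implementation; a simplified cumulative average.
class RunningCmn {
public:
    // Snapshot of the estimator that can be stored and restored later.
    struct State {
        std::vector<float> mean;
        std::size_t count;
    };

    explicit RunningCmn(std::size_t dims) : mean_(dims, 0.0f) {}

    // Subtracts the current mean from the frame, then updates the mean
    // with the raw (unnormalized) frame values.
    void normalize(std::vector<float>& frame) {
        ++count_;
        const std::size_t n = frame.size() < mean_.size() ? frame.size() : mean_.size();
        for (std::size_t i = 0; i < n; ++i) {
            const float raw = frame[i];
            frame[i] = raw - mean_[i];
            mean_[i] += (raw - mean_[i]) / static_cast<float>(count_);
        }
    }

    State save() const { return State{mean_, count_}; }

    void restore(const State& state) {
        mean_ = state.mean;
        count_ = state.count;
    }

private:
    std::vector<float> mean_;
    std::size_t count_ = 0;
};
```

In use, an application would call save() when recognition stops (or is muted) and restore() on the next start, so the first frames of the new session are normalized against the channel conditions already learned.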