PocketSphinx 0.6.1 release

This is a bugfix release, addressing a number of important issues in the 0.6 release. Specifically:

  • The GStreamer plugin was broken with old versions of GStreamer (such as the one shipped on Nokia Internet Tablets) which did not accept "BSD" as a valid license type.
  • Runtime performance of the statistical LM based decoder was significantly worse than 0.5 when not using phoneme lookahead.  It is now about 10% faster and also uses less memory.
  • The FSG decoder now consumes drastically less memory and CPU power.  (It is still not as good as it should be, but it is no longer completely embarrassing.)
  • The Python modules were undocumented.  Docstrings compatible with epydoc have been added.
  • Raw audio logging (-rawlogdir) was broken.
  • Adding new words now works properly, including words with unknown triphones (which never worked before).
  • The SphinxBase Python module can now use class-based language models.
  • The pitch estimation utility (sphinx_pitch) now builds properly on Win32.
  • Some problems with the N-Gram iterator API have been fixed.
  • Bestpath search has been fixed to handle recognition failure gracefully.
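
For readers unfamiliar with epydoc, the docstring style mentioned in the list above looks roughly like the sketch below. The function count_words is hypothetical and is not part of the PocketSphinx or SphinxBase Python modules; only the @param / @type / @return / @rtype field markup is epydoc's own.

```python
def count_words(hypothesis):
    """
    Count the words in a recognition hypothesis string.

    Hypothetical helper used only to illustrate epydoc field markup;
    it is not part of any Sphinx module.

    @param hypothesis: Space-separated words, or None if recognition failed.
    @type hypothesis: str or None
    @return: Number of words (0 when there is no hypothesis).
    @rtype: int
    """
    # Treat a missing or empty hypothesis as zero words.
    if not hypothesis:
        return 0
    return len(hypothesis.split())
```

Because these are ordinary docstrings, they are also visible through Python's built-in help() without epydoc installed.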

Source code is available for download on SourceForge.  Binary packages for Ubuntu will be updated shortly.

PocketSphinx Talk at PyCon

A talk by our own David.

CMUSphinx Participates in Free Software Ecosystem

Over the years CMUSphinx has become more and more popular. You may be interested to know that in each of the last three months CMUSphinx has ranked among the 25 most active projects on SourceForge. And that's not surprising.

A number of interesting new projects using CMUSphinx have been started recently. To mention a few, there is Vedics, a desktop control system, and the OpenCast project, an e-learning platform. Others are listed in our wiki.

But we certainly need more. If you've done your term project with CMUSphinx, don't stop there. Develop it into a full-featured application!

VocalKit: Shim for Speech Recognition on iPhone

Brian King wrote a blog post a while back on getting PocketSphinx up and running on the iPhone, and last week he got a few emails asking for help.  He was kind enough to clean up the code and make a little library out of it:

http://github.com/KingOfBrian/VocalKit

This should give you a library that statically links the Sphinx libraries and a simple API that connects to Audio Queue.  It also comes with a test program, so you should be able to have a demo up and working very quickly inside Xcode.

The plan is to extend the API to support dynamic language model creation, but the main goal is to get people up and running as soon as possible.  He would appreciate any feedback!