We are happy to announce that CMUSphinx-powered speech recognition has come to the Amazon Kindle. "Vague", or Voice Activated GUi Extension, was recently introduced and is already available for your Kindle with KUAL, a unified launcher.
If you have an old Kindle sitting around, or you just want to get a little more out of the one you use every day, jailbreaking is simple. Once you jailbreak, KUAL is a worthwhile little application launcher that gives you easy access to what you download.
KUAL works with pretty much every Kindle model. Once it's installed, you can run the program, and you're given a simple, easy-to-use launcher to access everything on your Kindle. That means games, VNC clients, apps, and plenty more. It's a nifty little launcher, and the fact that it works on pretty much every Kindle out there makes it simple to use.
Vague allows you to navigate through your book reader and launch various tools and, more importantly, it is designed with extensibility in mind. That means you can easily add your own commands with just a simple script!
Great job!
To learn more about how it works and to see a live demonstration, watch the video:
CMUSphinx is a great choice if you want to add speech recognition to your mobile or embedded device. If you need help, please join the #cmusphinx IRC channel on Freenode.net; we would be glad to help you configure and run it.
Recently, a new version of OpenEars was announced. The main feature of the new release, 1.3.0, is an upgrade to the latest CMUSphinx codebase, pocketsphinx-0.8. This upgrade should bring additional stability and performance, so you are welcome to try it!
OpenEars is the most popular free offline speech recognition and text-to-speech framework on iOS, and the basis for the OpenEars Platform, a plugin system that lets you drag-and-drop new speech capabilities into your iOS app.
If you are interested in examples of applications built with CMUSphinx and the OpenEars framework, please visit this cool project. Photo editing can be a challenging task, and it becomes even more difficult on small, portable screens, such as those of the camera phones now frequently used to edit images. To address this problem, PixelTone, a multimodal photo editing interface that combines speech and direct manipulation, was created:
This truly creative application demonstrates how a powerful multimodal interface can be built with CMUSphinx. Your application could be the next voice-enabled one!
Pocketsphinx is a great alternative to closed-source vendor SDKs due to its open source nature, extensibility and features. If you are looking to implement a speech application on Android, feel free to try Pocketsphinx. To get started, you can use existing applications like Inimesed.
It's a great application for selecting contacts, and you can install it on your device with a single click.
The sources and related materials are available on GitHub. Many thanks to Kaarel Kaljurand for his great software!
If you know of other applications using CMUSphinx, feel free to share!
A new (updated) English language model is available for download on our new Torrent tracker.
This is a good trigram language model for general transcription, trained on various open sources, for example Gutenberg texts.
It achieves good transcription performance on various types of texts; for example, on the following test sets the perplexities are:
Perplexity: 158.3
Perplexity: 206.677
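For readers unfamiliar with the measure, perplexity is the inverse geometric mean of the probabilities the model assigns to each word of a test set, so lower is better. The snippet below is a minimal sketch of that calculation and not the tool used to produce the figures above; the function name and the sample log probabilities are made up for illustration.

```python
def perplexity(log10_probs):
    """Perplexity from per-word log10 probabilities given by a language model."""
    if not log10_probs:
        raise ValueError("need at least one word probability")
    # Average log10 probability per word, then invert and exponentiate.
    average = sum(log10_probs) / len(log10_probs)
    return 10 ** (-average)


# Hypothetical log10 probabilities for a four-word test sentence.
sample = [-2.0, -2.5, -1.5, -3.0]
print(round(perplexity(sample), 1))  # about 177.8
```

A perplexity of 158.3 therefore means that, on average, the model is about as uncertain as if it had to choose uniformly among roughly 158 words at each position.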
Besides the transcription task, this model should be significantly better on conversational data like movie transcription.
The language model was pruned with a beam of 5e-9 to reduce its size. It can be pruned further if needed, or the vocabulary can be reduced to fit the target domain.
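As a rough illustration of vocabulary reduction, the sketch below keeps only the n-grams whose words all belong to a target word list. It assumes the n-grams have already been loaded into a dictionary keyed by word tuples; the function name, the data layout and the toy entries are hypothetical. It shows the filtering step only: a real language modeling toolkit would also renormalize the probabilities and recompute the backoff weights afterwards.

```python
# Sentence markers and the unknown-word token are always kept.
SPECIAL_TOKENS = {"<s>", "</s>", "<unk>"}


def restrict_to_vocabulary(ngrams, vocabulary):
    """Drop every n-gram that contains a word outside the target vocabulary.

    ngrams: dict mapping a tuple of words to its log10 probability.
    vocabulary: iterable of words to keep.
    """
    keep = set(vocabulary) | SPECIAL_TOKENS
    return {
        words: logprob
        for words, logprob in ngrams.items()
        if all(word in keep for word in words)
    }


# Toy entries with made-up log probabilities.
toy_model = {
    ("<s>", "open", "book"): -1.2,
    ("open", "the", "book"): -0.9,
    ("open", "the", "pod"): -2.7,
}
reduced = restrict_to_vocabulary(toy_model, ["open", "the", "book"])
print(sorted(reduced))
```

Here the trigram containing "pod" is dropped because that word is not in the target list, while the other two entries are kept unchanged.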