CMUSphinx Open Source Speech Recognition

Jun 29, 2013

Voice-enable Your Website With CMUSphinx

Voice-enabling websites has long been a dream. However, no good technology existed for this, because speech recognition on the web required either a connection to a server or the installation of a binary plugin.

The great news is that you can now use CMUSphinx in any modern browser, completely on the client side. There is no need for installation and no need to maintain a voice recognition server farm. This is a really cool technology.

Sylvain Chevalier has been working on a port of Pocketsphinx to JavaScript using Emscripten. Combined with the Web Audio API, it works great as a real-time recognizer for web applications, running entirely in the browser, without a plug-in.

It's on GitHub (https://github.com/syl22-00/pocketsphinx.js); comments, suggestions and contributions are more than welcome!

Jun 9, 2013

Pocketsphinx will be used in Ubuntu Unity 8

Months ago, Mark Shuttleworth announced and explained Ubuntu's converged vision, where a single OS is to power phones, tablets, desktops, TVs, etc.

Presently, the Ubuntu developers are working on strengthening Ubuntu for phones (Ubuntu Touch), a development in which speech recognition will probably play a significant role.

Among the use cases, speech recognition was demoed (at a mockup level) as part of HUD 2.0: the user triggers commands by pressing a button and speaking into the phone's microphone, and the spoken command is translated and applied like a regular command.

The great news today is that PocketSphinx has just landed in Ubuntu 13.10 by default, having moved from universe (where it was available via Ubuntu Software Center) directly into main (landing via the regular updates). This means PocketSphinx is to be used in the upcoming Unity 8 release on the desktop, probably to give users access to the full range of Unity 8's features.

This is definitely just the beginning of the work, but it's really great to see CMUSphinx on its way to the desktop. There will certainly be many problems along the way, since a proper implementation of a speech recognition system is not a trivial task and requires certain expertise. Your help is needed here; otherwise, all the issues will be assigned to Pocketsphinx.

Mar 30, 2013

Speech recognition on Kindle Touch with CMUSphinx

We are happy to announce that CMUSphinx-powered speech recognition has come to the Amazon Kindle. "Vague", or Voice Activated GUi Extension, was recently introduced and is already available for your Kindle with KUAL, a unified launcher.

If you have an old Kindle sitting around, or you just want to get a little more out of the one you use every day, jailbreaking is simple. Once you jailbreak, KUAL is a worthwhile little application launcher that gives you easy access to what you download.

KUAL works with pretty much every Kindle model. Once it's installed, you can run the program, and you're given a simple, easy-to-use launcher to access everything on your Kindle. That means games, VNC clients, apps, and plenty more. It's a nifty little launcher, and the fact that it works on pretty much every Kindle out there makes it simple to use.

Vague allows you to navigate through your book reader and launch various tools and, more importantly, it is designed with extensibility in mind. That means you can easily add your own commands with just a simple script!

Great job done!

To learn more about how it works and to see a live demonstration, see the video:

CMUSphinx is a great choice if you want to add recognition to your mobile or embedded device. If you need help, please join the #cmusphinx IRC channel on Freenode.net; we would be glad to help you configure and run it.

Feb 16, 2013

OpenEars version 1.3.0 Preview Is Available

Recently, a new version of OpenEars was announced. The main feature of the new 1.3.0 release is an upgrade to the latest CMUSphinx codebase, pocketsphinx-0.8. This upgrade should bring additional stability and performance, so you are welcome to try it!

OpenEars is the most popular free offline speech recognition and text-to-speech framework on iOS, and the basis for the OpenEars Platform, a plugin system that lets you drag-and-drop new speech capabilities into your iOS app.

If you are interested in examples of applications built with CMUSphinx and the OpenEars framework, please visit this cool project. Photo editing can be a challenging task, and it becomes even more difficult on the small, portable screens, such as those of camera phones, that are now frequently used to edit images. To address this problem, PixelTone, a multimodal photo editing interface that combines speech and direct manipulation, was created:

This truly creative application demonstrates how a powerful multimodal framework can be created with CMUSphinx. Your application could be the next voice-enabled one!
