We are pleased to announce that the CMUSphinx project has been accepted into the Google Summer of Code 2011 program. This will enable us to help several students start their way in speech recognition, open source development, and CMUSphinx. We are really excited about that.
If you are interested in participating as a student, the application period will open soon, but it's better to start preparing your application right now. Feel free to contact us with any questions! For more details see:
If you would like to be a mentor, please sign in to the GSoC web application and add your ideas to the ideas list:
We invite you to participate!
The problem is that, given the complexity of ASR algorithms, it's very hard to implement them all. Some of them work better in one situation, others in another. For a specific application you can always choose the most reasonable approach, but it may not be readily available in your system, and it might be quite resource-consuming to implement yourself. That's why frameworks like CMUSphinx are valuable for both researchers and speech application developers, and that's why we are so happy to see your contributions to CMUSphinx.
A good example of this is the set of approaches to estimating an MLLR transform. Basically, there is MLLR, where the means and variances of the Gaussians are transformed independently, and CMLLR, where the means and variances of the Gaussian distributions are constrained to share a single transform. CMLLR is more complex to estimate, but because it has fewer parameters it makes sense to apply it when your adaptation data is small. For example, if you have just a minute of speech to adapt on, CMLLR can give you better results than MLLR.
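To make the parameter-count argument concrete, here is a minimal sketch in NumPy. The transform matrices below are random placeholders, not estimated ones; in practice they come from an EM-style estimation over adaptation data (which is what the estimation code in SphinxTrain does). The point is only the structural difference: MLLR applies independent transforms to means and variances, while CMLLR reuses one shared transform for both.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3  # feature dimension (illustrative; real acoustic features are larger)

mu = rng.normal(size=d)                          # a Gaussian mean
sigma = np.diag(rng.uniform(0.5, 1.5, size=d))   # its covariance

# Placeholder transforms; real ones are estimated from adaptation data.
A_mean = np.eye(d) + 0.1 * rng.normal(size=(d, d))  # MLLR mean transform
b_mean = 0.1 * rng.normal(size=d)
H = np.eye(d) + 0.1 * rng.normal(size=(d, d))       # separate MLLR variance transform

# MLLR: means and variances get independent transforms.
mu_mllr = A_mean @ mu + b_mean
sigma_mllr = H @ sigma @ H.T

# CMLLR: one shared (constrained) transform for both mean and variance.
A = np.eye(d) + 0.1 * rng.normal(size=(d, d))
b = 0.1 * rng.normal(size=d)
mu_cmllr = A @ mu + b
sigma_cmllr = A @ sigma @ A.T

# Parameter counts per regression class:
# MLLR needs d*d + d (mean) plus d*d (variance) values;
# CMLLR needs only d*d + d, since the variance transform is not free.
mllr_params = d * d + d + d * d
cmllr_params = d * d + d
print("MLLR params:", mllr_params)    # 21 for d = 3
print("CMLLR params:", cmllr_params)  # 12 for d = 3
```

Fewer free parameters means the transform can be estimated reliably from less adaptation speech, which is exactly the small-data regime described above.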
Why do we write about this today, you'll ask? Easy. Today CMLLR estimation code landed in SphinxTrain trunk; see the file cmllr.py. Many thanks to Stephan Vanni, who contributed that part. It's a really valuable addition. Enjoy!
It's interesting that CMUSphinx gives developers all over the world the ability to build speech systems, interact with voice, and create something unique and useful.
This Kinect imitation is not really impressive or complicated as speech recognition goes; it's all about freedom of creativity. If you want to build an interactive application, just write a few lines of code and it will work, in many languages, for many people.
And we'll be satisfied as well.
Please consider the Sphinx4-based spoken dialog system toolkit demonstration http://www.okkoblog.com/2010/11/04/inprotk-demonstration/ by the University of Potsdam and the Inpro group. It includes a few interesting improvements to the core, such as prosody-driven end-of-turn classification, mid-utterance action execution, and display of partial ASR hypotheses.