Interestingly enough, revision number 10000 was committed to the SVN repository today. We are looking forward to seeing revision 100k, delivering the best ASR experience to you.
A number of minor issues have been found in the PocketSphinx 0.6 release. We will be preparing a 0.6.1 release to address these, but if you are affected by them, you can track the stable branch of the source code at http://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/branches/pocketsphinx-0.6 .
For Ubuntu users, source and binary packages for Lucid Lynx (10.04 beta) that track the bug fixes can be found at https://launchpad.net/~dhuggins/+archive/cmusphinx - or simply add ppa:dhuggins/cmusphinx to your list of software sources.
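As a quick sketch, tracking the stable branch and enabling the PPA might look like the following (the `pocketsphinx` package name is an assumption; adjust it for whatever packages you actually need):

```shell
# Check out the 0.6 stable (bug-fix) branch from SVN
svn checkout http://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/branches/pocketsphinx-0.6 pocketsphinx-0.6

# On Ubuntu Lucid, add the PPA and install the packages
sudo add-apt-repository ppa:dhuggins/cmusphinx
sudo apt-get update
sudo apt-get install pocketsphinx
```

Later, `svn update` inside the checkout will pull in any new bug fixes committed to the branch.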
Two great development meetings took place in Dallas during ICASSP. The goal was to produce a roadmap document describing what will happen in the near future. The first meeting was devoted to discussion; the second to reviewing the statements and making an action plan. The attendees were:
First meeting only:
The following topics were discussed.
Development directions. A WFST-based decoder has long been awaited as part of the CMUSphinx tools. Such decoders are considered very interesting because language models, acoustic models, the dictionary, and even result lattices are unified under a common data framework. Training could be done using the OpenFst tools. Such a framework could appear in the near future.
Development directions. CMU is starting a project at CISL dedicated to building methods to support almost all languages on Earth. It will include the collection of data for semi-supervised model training, automatic construction of the dictionary, language modeling, and many other interesting things. More updates on this project will appear soon.
Sphinx4-1.0 release. Right now sphinx4 is not as good as sphinx3 for the following reasons:
The issues above are listed in order of complexity. Once they are resolved, we can deprecate sphinx3 and release sphinx4-1.0. There is some concern about using Java for the most accurate recognizer; we need to run a poll on that issue, and we could also suggest PocketSphinx as a sphinx3 replacement for resource-constrained environments.
Documentation. We really want to improve the quality of the documentation. That means creating consistent online documentation with howtos, video tutorials, and many other things, as well as good printed documentation, which is also very important. The following would be nice to do in the near future:
Web-Service. Web services, in particular lmtool, have proved very successful because of the low cost of entry for trying the system. We need to develop the web infrastructure in various ways. Since this requires more control over the system and more computational resources, we have to set up a cluster to provide services:
We'll also provide a live system image of the CMU Sphinx tools to lower the barrier to trying CMU Sphinx.
Funding. We have a number of important things to be done. Since many of them require significant resources, it would be nice to have an organization able to fund development and infrastructure maintenance. Examples of such organizations are 501(c) non-profits like the Apache Software Foundation. Suggestions are welcome.
LIUM suggestions. LIUM is doing amazing work on the CMUSphinx project and we would be glad to get it merged. During the presentation, LIUM raised the following issues:
On our side, there are the following concerns:
We would be glad to discuss these things. A follow-up on this will be posted soon.
Various bits. We'll keep improving sphinx4 as time goes on. Some bits to mention:
We applied to the Google Summer of Code program for 2010 but were rejected. Regardless, we appreciate Google's contribution to open source development. We wish all the accepted projects a successful finish to the program this year, and good luck to all the students who will participate.
As for us, we still have plenty of tasks that any newcomer could get their hands on:
We would be glad to guide anyone who wants to start on them. If you are a student and want to learn more about speech recognition, this is your chance to jump in. We are also open to sponsorship suggestions for these tasks.