There were two great development meetings in Dallas during ICASSP. The goal was to develop a roadmap document describing what will happen in the near future. The first meeting was devoted to discussion, and the second to reviewing the resulting statements and making an action plan. Attendees were
Both meetings:
Only first meeting:
The following topics were discussed.
Development directions. A WFST-based decoder has long been awaited as part of the CMUSphinx tools. Such decoders are very interesting because language models, acoustic models, the dictionary and even result lattices are unified under a common data framework. Training could be done using the OpenFst tools. Such a framework could appear in the near future.
Development directions. CMU is starting a project at CISL dedicated to building methods to support almost all languages on Earth. It will include collecting data for semi-supervised model training, automatic selection of the dictionary, language modeling and many other interesting things. More updates on this project will appear soon.
Sphinx4-1.0 release. Right now sphinx4 is not as good as sphinx3 for the following reasons:
The issues here are listed in order of their complexity. Once those issues are resolved we can deprecate sphinx3 and release sphinx4-1.0. There is concern about using Java for the most accurate recognizer; we need to run a poll on that issue, and we could also suggest pocketsphinx as a sphinx3 replacement for resource-constrained environments.
Documentation. We really want to improve the quality of documentation. This means we'll try to create consistent online documentation with how-tos, video tutorials and many other things, as well as good printed documentation, which is also very important. The following things would be nice to do in the near future:
Web-Service. Web services, in particular lmtool, proved to be very successful because of the low entrance cost of trying the system. We need to develop web infrastructure in various ways. Since this requires more control over the system and more computational resources, we have to set up a cluster to provide services:
We'll also provide a live system image for the CMU Sphinx tools to lower the barrier to trying CMU Sphinx.
Funding. We have a number of things to be done. Since many of them require significant resources, it would be nice to have an organization that is able to fund the development and infrastructure maintenance. Examples of such organizations are 501(c)(3) non-profits like the Apache Foundation. Suggestions are welcome.
LIUM suggestions. LIUM is doing amazing work on the CMUSphinx project and we would be glad to have it merged. During the presentation LIUM raised the following issues:
On our side there are the following concerns:
We would be glad to discuss those things. A follow-up on this will be posted soon.
Various bits. We'll try to improve sphinx4 as time goes on. Some bits to mention:
We applied for the Google SoC program 2010 but were rejected. Anyway, we appreciate Google's contribution to open source development. We wish all accepted projects a successful finish to the program this year, and we wish good luck to all the students who will participate.
As for us, we still have a lot of tasks that any newbie could get their hands on:
https://cmusphinx.github.io/wiki/summerofcodeideas
We would be glad to guide anyone who wants to start with them. If you are a student and want to learn more about speech recognition, this is your chance to jump in. We are also open to sponsorship suggestions for this task.
So, the CMU Sphinx workshop in Dallas is over. Let us congratulate all participants, especially the submission authors. It was a great event, the room was full! We were amazed by the number of people who attended, and by their passion and interest in CMU Sphinx. We would be glad to see more participants next year!
For those who missed the workshop, the papers and some slides are available on the website. You will certainly find something interesting there, such as new feature release announcements, application details and new research topics. We didn't forget to support ASR research, of course. The workshop was recorded by many recording devices of various types, and this data will serve as a database for a meeting transcription system.
Of course, the most important part of being at a workshop is face-to-face communication. It was important for us to collect and address the concerns of our users. The main issues noted were the following:
Luckily, the problems above are mostly organizational issues. There were two development meetings after the workshop to address them. Expect a new announcement about it soon.
We would be glad to continue discussions about CMU Sphinx. Please subscribe to the development mailing list https://lists.sourceforge.net/lists/listinfo/cmusphinx-devel. We would be glad to answer your questions and would appreciate your suggestions.
We are pleased to announce the long-awaited PocketSphinx 0.6 release, including SphinxBase 0.6. This release corresponds to SVN revision 9898.
PocketSphinx is a small-footprint continuous speech recognition system, freely licensed under a simplified BSD license, suitable for handheld and desktop applications. It features:
The release branch can be accessed via Subversion at http://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/branches/pocketsphinx-0.6 - this is the preferred way to access the release, particularly if you are using Windows.
This exact release tag can be accessed at http://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/tags/pocketsphinx-0.6
Source code archives are now available for download at http://sourceforge.net/projects/cmusphinx/
Debian/Ubuntu source packages are available from https://launchpad.net/~dhuggins/+archive/cmusphinx