New Model-In-Jar file format

The latest version of Sphinx4 has a new and improved model system.

Now all acoustic and language models are loaded as normal files in directories.
File paths are specified as URIs and therefore may exist anywhere on the Internet.
In addition, for convenience, a "resource:" prefix causes Sphinx4 to look for
the file on the classpath.
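
The lookup rule described above can be sketched roughly as follows. This is only an illustration of the idea, not Sphinx4's actual loader code: the class name ModelLocation, its methods, and the model paths in main are all hypothetical and chosen just for this example.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper, NOT part of Sphinx4: shows how a location string
// with a "resource:" prefix could be mapped to a classpath lookup, while
// any other location is treated as an ordinary URI.
final class ModelLocation {
    private static final String RESOURCE_PREFIX = "resource:";

    private ModelLocation() {}

    // Strips the "resource:" prefix and any leading slashes, because
    // ClassLoader.getResource expects a path without a leading '/'.
    static String classpathPath(String location) {
        String path = location.substring(RESOURCE_PREFIX.length());
        while (path.startsWith("/")) {
            path = path.substring(1);
        }
        return path;
    }

    // Returns a URL for the location, or null when a "resource:" location
    // is not found on the classpath of the given loader.
    static URL resolve(String location, ClassLoader loader) {
        if (location.startsWith(RESOURCE_PREFIX)) {
            return loader.getResource(classpathPath(location));
        }
        try {
            return URI.create(location).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Bad model location: " + location, e);
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ModelLocation.class.getClassLoader();
        // Both paths below are made-up examples.
        System.out.println(resolve("file:/tmp/models/en-us", loader));
        System.out.println(resolve("resource:/models/en-us/mdef", loader));
    }
}
```

With a scheme like this, the same configuration value can point at a local directory, a remote server, or a file packaged inside a jar, without any special loader class.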

Special Models and ModelLoaders are no longer required, and resource specifiers
no longer require the clumsy resource:!/ syntax.

Kudos to Peter

JSGF Refactoring in sphinx4

A major refactoring of the JSAPI part of sphinx4 happened recently. Its roots lie deep in the history of sphinx4. From its birth in the previous century, sphinx4 was meant to support industry-strength standards, in particular the Java Speech API (JSAPI). In fact, sphinx4 was a playground for JSAPI development.

Like any unmaintained code written long ago, the JSAPI support code was rather hard to read and modify. Most importantly, JSAPI structures were used everywhere in it. That would be fine if JSAPI were free and distributed as source; unfortunately, that is not the case. It comes under a restrictive license that prevents free redistribution. That was the major problem, and I bet you ran into it when you first started sphinx4 development and forgot to unpack jsapi.jar and agree to its license. Not to mention that the implementation of the API was incomplete: basically you could only play with grammars, nothing more. No real recognizer API was supported.

Now this situation has changed drastically:

  • The JSGF parser and grammars are now part of sphinx4.jar, free from any licence issues. You can use JSGF grammars like any other part of the API.
  • The JSAPI implementation is now built on top of sphinx4, making it easy to split out, test and use.
  • The new JSAPI code looks more or less modern.
  • A functional JSAPI-1.0 implementation, with the recognition part implemented, will be here soon.
  • In the near future a JSAPI-2.0 interface will probably arrive.

Such big architectural changes are not smooth, of course; regressions are expected. Don't hesitate to report them :).

CMU Sphinx Users and Developers Workshop 2010

13 March 2010, Dallas, TX, USA

Event URL: http://www.cs.cmu.edu/~sphinx/Sphinx2010

Papers are solicited for the CMU Sphinx Workshop for Users and Developers (CMU-SPUD 2010), to be held in Dallas, Texas as a satellite to ICASSP 2010.

Important Dates

Paper submission: 11 December 2009
Notification of paper acceptance: 15 January 2010
Workshop: 13 March 2010

CMU Sphinx is one of the most popular open source speech recognition systems. It is currently used by researchers and developers in many locations world-wide, including universities, research institutions and industry. CMU Sphinx's liberal license terms have made it a significant member of the open source community and have provided a low-cost way for companies to build businesses around speech recognition.

The first SPUD workshop aims at bringing together CMU Sphinx users, to report on applications, developments and experiments conducted using the system. This workshop is intended to be an open forum that will allow different user communities to become better acquainted with each other and to share ideas. It is also an opportunity for the community to help define the future evolution of CMU Sphinx.

We are planning a one-day workshop with a limited number of oral presentations, chosen for breadth and stimulation, held in an informal atmosphere that promotes discussion. We hope this workshop will expose participants to different perspectives and that this in turn will help foster new directions in research, suggest interesting variations on current approaches and lead to new applications.

Papers describing relevant research and new concepts are solicited on, but not limited to, the following topics. Papers must describe work performed with CMU Sphinx:

  • Decoders: PocketSphinx, Sphinx-2, Sphinx-3, Sphinx-4
  • Tools: SphinxTrain, CMU/Cambridge SLM toolkit
  • Innovations / additions / modifications of the system
  • Speech recognition in various languages
  • Innovative uses, not limited to speech recognition
  • Commercial applications
  • Open source projects that incorporate Sphinx
  • Novel demonstrations

Manuscripts must be between 4 and 6 pages long, in standard ICASSP double-column format. Accepted papers will be published in the workshop proceedings.

Organizers

Bhiksha Raj - Carnegie Mellon University
Evandro Gouvêa - Mitsubishi Electric Research Labs
Richard Stern - Carnegie Mellon University
Alex Rudnicky - Carnegie Mellon University
Rita Singh - Carnegie Mellon University
David Huggins-Daines - Carnegie Mellon University
Nickolay Shmyrev - Nexiwave
Yannick Estève - Laboratoire d'Informatique de l'Université du Maine

Contact

To contact the organizers, please send email to sphinx+workshop@cs.cmu.edu

New Website

CMU Sphinx is going to rule the world. The leading speech recognition engine goes user-friendly! Join us now!