CMUSphinx Documentation
This page contains collaboratively developed documentation for the CMU Sphinx speech recognition engines.
Beginner User Documentation
This section contains links to documents which describe how to use Sphinx to recognize speech. Currently, we have very little in the way of end-user tools, so it may be a bit sparse for the foreseeable future.
- Tutorial: Getting started with CMUSphinx for developers
- Basic concepts of speech
- Overview of the CMUSphinx toolkit
- Before you start
- Building application using pocketsphinx
- Building application using sphinx4
- Using pocketsphinx on Android
- Building language models
- Adapting existing acoustic model
- Building the acoustic model
- Building a dictionary
- Debugging speech recognition accuracy
- PocketSphinx for pronunciation evaluation
If you run into trouble, read the FAQ.
See also some more docs:
- Decoder Versions: Description of the software packages
- Download Details: How to obtain CMUSphinx packages
- How to get help and discuss things
To find out where CMUSphinx is used, see:
- Projects that use Sphinx: These projects, both commercial and free, use Sphinx in one form or another.
Advanced User Documentation
These documents either describe some particular aspect of the Sphinx codebase in detail, or they serve as a developer’s guide to accomplishing some particular task.
- Building: Building Pocketsphinx on various platforms
- AsteriskDetails: How to use pocketsphinx in Asterisk.
- DecoderTuning: How to tune the decoder to be fast (or rather, not horribly slow)
- PocketsphinxHandhelds: Pocketsphinx optimizations for embedded devices; covers the same ground as DecoderTuning above, specifically for Pocketsphinx.
- PhonemeRecognition: How to use pocketsphinx for phoneme recognition.
- SpeakerDiarization: Using LIUM tools for speech segmentation and speaker diarization
- LDAMLLT: How to train acoustic models with LDA and MLLT feature transforms
- GStreamer: How to use PocketSphinx with GStreamer and Python
- InstallingPythonStuff: How to install Python and necessary modules for SphinxTrain development
- MMIE_Train: How to perform MMIE training.
How To Contribute
Please take a look at the project ideas listed on the ProjectIdeas page; some of them are easy, some harder. If you want to start work on any of them, please let us know.
Reference
These documents describe the APIs in excruciating detail, or provide other useful background information for CMUSphinx developers.
Developer Documentation
This section contains various internal information for CMUSphinx developers. We hope it will still be useful to you.
- Sphinx-4 Regression Tests: How to run regression tests
- SphinxTrainWalkthrough: An overview of the SphinxTrain source code for researchers and developers
- CMUCLMTK development: Development guide for the CMU-Cambridge Language Modeling Toolkit.
- CodingStyle for SphinxBase, SphinxThree, and SphinxTrain
- ReleaseSchedule: Plans for upcoming releases of Sphinx
- Release Check List: How to make a release
- Web Site Layout: How to organize information
- Sphinx4 Space: Information about sphinx4, design, code, performance, history.
File formats
Data sources
Available data sources are covered on the SpeechData page.
Speech Recognition Theory
This section collects research ideas for specific problems in speech recognition.