PocketSphinx 5.0.3 is now released. This is a patch release which adds support for Python 3.12 and fixes a bug in the NGramModel wrapper class.
Download source from GitHub or PyPI. Yes, these are not exactly the same file.
Install binaries for Python:
pip3 install pocketsphinx
Read the API documentation for C and for Python3
Pull requests and bug reports and such are welcome via https://github.com/cmusphinx/pocketsphinx.
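Once the Python package is installed, a quick way to check that it works is to decode a short WAV file. The following is only a minimal sketch: the helper names `read_pcm16_mono` and `recognize_wav` are made up for illustration, and the `Decoder` calls reflect my understanding of the Python bindings with the default model, so check the API documentation before relying on them. It assumes the file is 16 kHz, 16-bit, mono.

```python
import wave


def read_pcm16_mono(path, expected_rate=16000):
    """Return the raw PCM bytes of a 16-bit mono WAV file, or raise ValueError."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono audio")
        if wav.getframerate() != expected_rate:
            raise ValueError(f"expected {expected_rate} Hz audio")
        return wav.readframes(wav.getnframes())


def recognize_wav(path):
    """Decode one WAV file and return the recognized text, or None if there is none."""
    # Imported lazily so the helper above can be used without pocketsphinx installed.
    from pocketsphinx import Decoder

    decoder = Decoder(samprate=16000)  # assumption: default model, 16 kHz input
    decoder.start_utt()
    decoder.process_raw(read_pcm16_mono(path), full_utt=True)
    decoder.end_utt()
    hyp = decoder.hyp()
    return hyp.hypstr if hyp is not None else None
```

Calling `recognize_wav("some_file.wav")` (a placeholder file name) should then return a string, or `None` if nothing was recognized.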
PocketSphinx 5.0.1 is now released. This is a patch release which fixes a number of bugs and documentation errors in PocketSphinx 5.0.0. See the link above for more detail.
Download source from GitHub or PyPI. Yes, these are not exactly the same file.
Install binaries for Python:
pip3 install pocketsphinx
Read the API documentation for C and for Python3
Pull requests and bug reports and such are welcome via https://github.com/cmusphinx/pocketsphinx.
Well, it turns out that people were using pocketsphinx_continuous, at least sort of. As I expected, they weren’t really using the actual pocketsphinx_continuous binary for anything useful other than recognizing from files. But, well, the code did claim to be example code, and so obviously people were using it … as example code.
…which is a perfectly sensible thing to do. Unfortunately, once the audio support was removed from PocketSphinx, the code became considerably less useful as an example of how to do recognition from a microphone, particularly if running SoX in a subprocess isn’t an appealing solution (as on Windows, for instance).
The sensible solution to this is to bring back something like pocketsphinx_continuous but explicitly in the form of example code.
Adding cross-platform audio support to the library is absolutely something I will not do, but there are some other options, PortAudio foremost among them. So, here is an example of using PortAudio.
That said, wrangling external dependencies on Windows is very annoying. Using the above example may require a certain amount of path and environment wrangling to get CMake/VSCode/Visual Studio to find PortAudio. For this reason there is also now an example of using the Win32 Waveform Audio API directly.
Note that in both cases you may have quite bad results when running a “Debug” build, because Windows is very slow, and Visual C++ outputs extremely slow code when debugging is enabled.
These examples are included in the upcoming 5.0.1 release in the examples directory.
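For comparison, the SoX-in-a-subprocess approach mentioned earlier can be sketched in Python roughly as follows. This is not the code from the examples directory, just an illustration under some assumptions: `sox_command`, `read_frames` and `listen` are hypothetical helper names, SoX is assumed to be on the path with a working default input device, and the `Endpointer` and `Decoder` calls reflect my understanding of the Python bindings.

```python
import subprocess


def sox_command(sample_rate=16000):
    """Build a SoX command that records raw 16-bit mono PCM from the default device to stdout."""
    return ["sox", "-q", "-r", str(sample_rate), "-c", "1", "-b", "16",
            "-e", "signed-integer", "-d", "-t", "raw", "-"]


def read_frames(stream, frame_bytes):
    """Yield fixed-size frames from a binary stream, stopping at the first short read."""
    while True:
        frame = stream.read(frame_bytes)
        if len(frame) < frame_bytes:
            return
        yield frame


def listen():
    """Print one line of text per detected utterance until SoX exits."""
    # Imported lazily so the helpers above work without pocketsphinx installed.
    from pocketsphinx import Decoder, Endpointer

    ep = Endpointer()  # voice activity detection and utterance segmentation
    decoder = Decoder(samprate=ep.sample_rate)
    with subprocess.Popen(sox_command(ep.sample_rate), stdout=subprocess.PIPE) as sox:
        for frame in read_frames(sox.stdout, ep.frame_bytes):
            was_in_speech = ep.in_speech
            speech = ep.process(frame)
            if speech is None:
                continue  # no speech in this frame
            if not was_in_speech:
                decoder.start_utt()  # speech has just started
            decoder.process_raw(speech)
            if not ep.in_speech:
                decoder.end_utt()  # speech has just ended
                hyp = decoder.hyp()
                if hyp is not None:
                    print(hyp.hypstr)
```

Keeping the SoX command and the frame reader separate from the decoding loop makes it easy to swap in another audio source, which is more or less what the PortAudio and Win32 examples do in C.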
There is also an updated release of SphinxTrain, and the acoustic modeling tutorial has been updated to reflect the new and simplified usage. Still working on the other tutorials, sorry.
To quote the release notes, this release fixes a few long-standing bugs in SphinxTrain and makes the package (hopefully) easier to use. Among other things:
sphinx3_align is included)
The -remove_silence option has been disabled by default (unlike in PocketSphinx, you can still turn it on if you really want to; it might save you a bit of time in training)
You can download it from the release page.
Or clone it (shallowly) with git:
git clone --depth 1 --branch v5.0.0 https://github.com/cmusphinx/sphinxtrain
Pull requests and bug reports and such are welcome via https://github.com/cmusphinx/sphinxbase.