SphinxTrain 5.0.0 is released!

There is also an updated release of SphinxTrain, and the acoustic modeling tutorial has been updated to reflect the new and simplified usage. Still working on the other tutorials, sorry.

To quote the release notes, this release fixes a few long-standing bugs in SphinxTrain and makes the package (hopefully) easier to use. Among other things:

  • The dependency on SphinxBase is gone, because SphinxBase is gone
  • The dependency on Sphinx3 for VTLN and force-alignment is gone (sphinx3_align is included)
  • Multi-CPU training actually works, tested on up to 64 CPUs with LibriSpeech, much easier than setting up PBS on the Clown
  • The dependency on Visual Studio for building on Windows is gone (but please just use WSL, please)
  • The dependency on Autotools is gone (CMake ain’t great but it’s much less bad)
  • There is a Dockerfile now
  • There is “continuous integration” now (sort of)
  • The -remove_silence option has been disabled by default (unlike in PocketSphinx you can still turn it on if you really want to, it might save you a bit of time in training)
  • It is not necessary to install SphinxTrain system-wide to run training
  • G2P support has been updated for the most commonly installed version of OpenFST (do not try to use any other version, because C++, that’s why)
  • Pick Decoding Model Based on Context Dependence by @Mazyod
  • Output an error message when we cannot execute a tool by @cshung
  • Make an option in config for not folding case in phonemes by @lenzo-ka
  • Use consistent shebang for python by @acgrobman
  • Add -sox flag to sphinx_fe to convert files with SoX by @dhdaines
  • Update and enable G2P code by @dhdaines
  • Librispeech training template by @dhdaines

You can download it from the release page

Or clone it (shallowly) with git:

git clone --depth 1 --branch v5.0.0 https://github.com/cmusphinx/sphinxtrain

Pull requests and bug reports and such are welcome via https://github.com/cmusphinx/sphinxtrain.

PocketSphinx 5.0.0 is released!

It is finally here: PocketSphinx 5.0.0.

Download source from GitHub or PyPI. Yes, these are not exactly the same file.

Install binaries for Python:

pip3 install pocketsphinx

Read the API documentation for C and for Python3

Pull requests and bug reports and such are welcome via https://github.com/cmusphinx/pocketsphinx.

PocketSphinx 5.0.0 release candidate 5

Executive Summary: Please try this one, there won’t be another.

Yes, it’s that time of week again, time for another release candidate. You can also download it from PyPI.

There are a lot of changes, so I suggest you look at the release notes at the link above. Python code should continue to work as before, though you may get some deprecation warnings when you try to use the inappropriately named set_{fsg,lm,kws} methods. Don’t use them: they have the wrong names. Use add_* instead. The names were changed because they don’t set anything and you have to actually activate the search module afterwards. Now you use ps_activate_search() to do that, and not ps_set_search(), because this, too, is a much better name.
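As a rough illustration of the add-then-activate pattern in Python, the sketch below registers a keyphrase search and then selects it. It assumes the Python Decoder exposes add_keyphrase(name, keyphrase) and activate_search(name) as counterparts of the C functions; the helper name use_keyphrase and the search name are made up for this example.

```python
def use_keyphrase(decoder, name, keyphrase):
    """Register a keyphrase search under `name`, then make it the active one.

    `decoder` is expected to be a pocketsphinx.Decoder (assumption: it provides
    add_keyphrase() and activate_search()). This helper is hypothetical.
    """
    # add_* only registers the search module; it does not select it.
    decoder.add_keyphrase(name, keyphrase)
    # Activation is the separate, explicit second step.
    decoder.activate_search(name)
    return name
```

With the package installed, something like use_keyphrase(Decoder(), "wakeup", "hey computer") should be all that is needed before decoding, provided the words are in the dictionary.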

That’s actually the least of it. The big news is that force-alignment and subword alignment are now quite doable, from the command-line, from the C API, and from Python. There are some tests and examples for you to look at.
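For a sense of what word-level force-alignment might look like from Python, here is a minimal sketch. It assumes the Decoder offers set_align_text() alongside the usual start_utt(), process_raw(), end_utt() and seg() methods, that the audio is raw 16-bit mono PCM at the model's sample rate, and that the frame rate is the default 100 frames per second. The helper name align_words is hypothetical, and subword (phone and state) alignment is not covered here.

```python
def align_words(decoder, text, audio, frame_rate=100):
    """Force-align `text` to `audio`; return (word, start_sec, end_sec) tuples.

    Assumptions: `decoder` is a pocketsphinx.Decoder with set_align_text(),
    `audio` is raw 16-bit mono PCM bytes, and `frame_rate` matches the
    decoder's frames per second (100 unless configured otherwise).
    """
    # Constrain the decoder to exactly this word sequence.
    decoder.set_align_text(text)
    # Decode the whole utterance in a single call.
    decoder.start_utt()
    decoder.process_raw(audio, no_search=False, full_utt=True)
    decoder.end_utt()
    # end_frame is inclusive, so add one frame to get the end time.
    return [
        (seg.word, seg.start_frame / frame_rate, (seg.end_frame + 1) / frame_rate)
        for seg in decoder.seg()
    ]
```

The tests and examples in the repository remain the place to look for the real thing, including phone-level timings.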

The last known portability issue (which was actually, like, a bug) is fixed and you won’t get unpredictable and bad results on MIPS systems. There are surely others, though. Ideally our CI testing would run things on various emulators, but it’s slow and unwieldy to do that.

The JSGF compiler is back to producing unreasonable numbers of epsilon transitions (“null” transitions for the less FST-aware), but it produces correct output now.

Pull requests and bug reports and such are welcome via https://github.com/cmusphinx/pocketsphinx.

PocketSphinx 5.0.0 release candidate 4

Executive Summary: Alas, poor SphinxBase!

Yes, it’s that time of week again, time for another release candidate. You can also download it from PyPI.

In the spirit of total elimination, the major change here is the disappearance of the <sphinxbase/*.h> headers. Some of them have been relocated, so if you include <pocketsphinx.h> you can still do useful things like load and save language models and parse JSGF. Oh, and also do speech recognition, maybe.

There are a number of other things you can’t do, because the “utility” headers were mostly unsuitable for public consumption. Really they were a bit embarrassing, at least in 2022. A major rationale for removing SphinxBase from circulation is that it just isn’t a good foundation for you to build “applications” or anything else really. Like, there are at least a dozen better implementations of pretty much everything in there, and you should really use them. Command-line parsing, for instance, should not be done with <cmd_ln.h>, so it has been hidden from you to discourage you from trying that.

Which brings us to the other major breaking change here. Configuration is not done by parsing (possibly imaginary) command lines anymore. You can simply create a configuration and set values in it, e.g.:

ps_config_t *config = ps_config_init(NULL);
ps_config_set_str(config, "hmm", "/path/to/model");
ps_config_set_int(config, "samprate", 11025);

You can also parse JSON, or even a sort of degenerate “JSON”:

/* Strict JSON: */
ps_config_t *config = ps_config_parse_json(
    NULL, "{\"hmm\": \"/path/to/model\"}");
/* Or the degenerate form, without braces or quotes: */
ps_config_t *config2 = ps_config_parse_json(
    NULL, "hmm: /path/to/model, samprate: 11025");

The configuration can be serialized to (actual) JSON as well:

const char *jconf = ps_config_serialize_json(config);

Creating a ps_config_t sets all of the default values, but does not set the default model, so you still need to use ps_default_search_args() for that. Also note that ps_expand_model_config() no longer creates magical underscore versions of the config parameters (e.g. "_hmm", "_dict", etc) but simply overwrites the existing values.

Python code is entirely unaffected by these changes (though it has also acquired the JSON functions mentioned above), so you should maybe use Python instead of hurting yourself with the C API.
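If you would rather not escape quotes by hand, the strict JSON form can be generated with Python's standard json module. The sketch below only builds the string, using the parameter names from the C example above. Handing it to the decoder (for example through a Config.parse_json() method, which is an assumption about the Python API here) is left out. The helper name make_config_json is made up.

```python
import json

def make_config_json(**settings):
    """Serialize decoder settings to strict JSON with correct quoting."""
    # sort_keys makes the output deterministic, which is handy for diffs.
    return json.dumps(settings, sort_keys=True)

conf_json = make_config_json(hmm="/path/to/model", samprate=11025)
print(conf_json)  # {"hmm": "/path/to/model", "samprate": 11025}
```

The same string should be acceptable to ps_config_parse_json() in C, since it is ordinary JSON.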

Pull requests and bug reports and such are welcome via https://github.com/cmusphinx/pocketsphinx.