Long Audio Training: Update 2


Work over the past few days addressed the use of the restrictive read_line function in SphinxTrain. All occurrences of read_line were eliminated in favor of the line iterators from sphinxbase. Along the way, a decision was made to modify the lineiter interface so that it supports the original read_line functionality, such as comment skipping and whitespace trimming. The interface now consists of the following functions:

  • lineiter_init - init iterator for reading without any preprocessing
  • lineiter_init_clean - init iterator for reading compatible with the read_line function: skip commented lines and trim leading and trailing whitespace
  • lineiter_next - read next line from the file
  • lineiter_free - finish reading and free resources
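
The "clean" preprocessing described above can be sketched in standalone C. This is only an illustration, not the sphinxbase implementation: the helper names trim_in_place and clean_line are hypothetical, and the sketch assumes that a comment is a line whose first character is '#' and that the comment check happens before trimming.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of sphinxbase): strips leading and
 * trailing whitespace in place and returns a pointer to the first
 * non-whitespace character inside the same buffer. */
static char *trim_in_place(char *s)
{
    char *end;

    while (*s != '\0' && isspace((unsigned char)*s))
        s++;

    end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';

    return s;
}

/* Hypothetical helper: returns the trimmed line, or NULL when the line
 * should be skipped. ASSUMPTION: a comment is a line starting with '#'. */
static char *clean_line(char *line)
{
    if (line[0] == '#')
        return NULL;
    return trim_in_place(line);
}
```

A caller would apply clean_line to each raw line and simply move on to the next line whenever it returns NULL.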

Existing uses of line iterators in sphinxbase were also updated to match the modified interface. These changes were necessary to enable SphinxTrain to train on audio files of unlimited size, and lineiter is meant to become the central input interface in SphinxTrain.
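
To illustrate why an iterator-style reader helps with unlimited input, here is a minimal sketch of reading a line of arbitrary length into a buffer that grows on demand. It assumes that the restriction in question is a fixed-size line buffer, which this post does not state explicitly. The function read_long_line is hypothetical and is not the lineiter code; it uses only the standard C library.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of sphinxbase): reads one complete line
 * from fh into *buf, doubling the buffer whenever the line does not fit.
 * Returns *buf, or NULL at end of file or on allocation failure.
 * The caller keeps *buf and *cap between calls and frees *buf at the end. */
static char *read_long_line(FILE *fh, char **buf, size_t *cap)
{
    size_t len = 0;

    if (*buf == NULL) {
        *cap = 128;
        *buf = malloc(*cap);
        if (*buf == NULL)
            return NULL;
    }

    while (fgets(*buf + len, (int)(*cap - len), fh) != NULL) {
        len += strlen(*buf + len);
        if (len > 0 && (*buf)[len - 1] == '\n')
            return *buf;                /* a complete line was read */

        if (len + 1 >= *cap) {          /* buffer is full: grow and go on */
            char *tmp = realloc(*buf, *cap * 2);
            if (tmp == NULL)
                return NULL;
            *buf = tmp;
            *cap *= 2;
        }
    }

    /* End of file: return a final line that lacks a trailing newline. */
    return len > 0 ? *buf : NULL;
}
```

Because the buffer is reused across calls, memory use is bounded by the longest single line rather than by the size of the file.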

The investigation of memory issues is ongoing. It will be followed by an implementation of the memory-optimized Baum-Welch algorithm described at https://cmusphinx.github.io/wiki/longaudiotraining, and by a search for other ways to reduce the unreasonable memory demands of the current SphinxTrain version.