(Author: Srikanth Ronanki)
(Status: GSoC 2012 Pronunciation Evaluation Week 1)
Last week, I accomplished the following:
- I successfully tested producing phoneme acoustic scores from sphinx3_align using two recognition passes. Passing the state segmentation parameter -stsegdir as an argument to the program yields acoustic scores for each frame, and thereby for each phoneme as well. However, the program's output still has to be decoded into integer format, which I will try to do by the end of next week.
- I wrote a program that converts a list of each phoneme's "neighbors," or most similar other phonemes, provided by the project mentor, from the Worldbet phonetic alphabet to CMUbet. Yesterday, when I compared the two files manually, I found that some of the phones were mismatched, so I re-checked my code and fixed the bug. The corrected program takes a string of phonemes representing an expected utterance as input and produces a sphinx3 recognition grammar consisting of a string of alternatives representing each expected phoneme and all of its neighboring phonemes, for automatic edit distance scoring.
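To illustrate the idea of expanding an expected utterance into per-phoneme alternatives, here is a minimal sketch. It is not the checked-in program: the NEIGHBORS table holds made-up illustrative values rather than the mentor's list, the function names are hypothetical, and the "( A | B )" notation is a generic stand-in, not the actual sphinx3 grammar file format.

```python
# Hypothetical neighbor table: illustrative values only, not the mentor's list.
NEIGHBORS = {
    "AY": ["AA", "AH"],
    "HH": ["F"],
    "AE": ["EH", "AH"],
    "D": ["T"],
}


def build_alternatives(phonemes, neighbors):
    """Return one list of alternatives per expected phoneme.

    The expected phoneme always comes first, followed by its neighbors
    (duplicates are skipped). Phonemes without an entry stand alone.
    """
    alternatives = []
    for phone in phonemes:
        options = [phone]
        for candidate in neighbors.get(phone, []):
            if candidate not in options:
                options.append(candidate)
        alternatives.append(options)
    return alternatives


def format_grammar(alternatives):
    """Render the alternatives in a generic ( A | B ) notation.

    This notation is an assumption made for readability; it is not the
    sphinx3 grammar file format.
    """
    return " ".join("( " + " | ".join(options) + " )" for options in alternatives)


# Expected utterance given as a whitespace-separated string of CMUbet phonemes.
expected = "AY HH AE D".split()
grammar = format_grammar(build_alternatives(expected, NEIGHBORS))
print(grammar)
```

Keeping the expected phoneme first in each group makes it easy for a later scoring step to tell a correct phoneme from a substituted neighbor.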
All the programs I have written so far are checked in with Subversion at http://cmusphinx.svn.sourceforge.net/viewvc/cmusphinx/branches/speecheval/ronanki. (Similarly, Troy's code is checked in at http://cmusphinx.svn.sourceforge.net/viewvc/cmusphinx/branches/speecheval/troy.)
Here is the procedure for using that code to obtain the CMUbet neighboring phonemes from a file that contains a string of phonemes:
- To convert the Worldbet phonetic alphabet to CMUbet:
    python convert_world2cmu.py
- To convert the input list of phonemes to neighboring phones:
    python convert2_ngbphones.py
- Example: "I had faith in them" (arctic_a0030), a sentence from the CMU ARCTIC database: