Web data collection for pronunciation evaluation


(author: Troy)

(status: week 4)

[Project mentor note:  I have been holding these more recent blog posts pending some issues with Adobe Flash security updates which periodically break cross-platform audio upload web browser solutions. We have decided to plan for a fail-over scheme using low-latency HTTP POST multipart/form-data binary Speex uploads to provide backup in case Flash/rtmplite fails again in the future. This might also support most of the mobile devices. Please excuse the delay and rest assured that progress continues and will continue to be announced at such time as we are confident that we won't need to contradict ourselves as browser technology for audio upload continues to develop. --James Salsman]

The data collection website now can provide basic capabilities. Anyone interested, please check out http://talknicer.net/~li-bo/datacollection/login.php and give it a try. If you encounter any problems, please let us know.

Here are my accomplishments from last week:

1) Discussed the project schema design with  the project mentor and created the database with MySQL. The current schema is shown at http://talknicer.net/w/Database_schema.  During the development of the user interface, slight modifications were made to refine the database schema, such as the age field in for the users table: Storing the user's birth date is much better. Other similar changes were made. I learned that good database design comes from practice, not purely imagination.

2) Implemented the two types of user registration pages: one for students and one for exemplar uploaders. To avoid redundant work and allow for fewer constraints on types of users, the registration process involves two steps: one basic registration and one extra information update. For students, only the basic one is mandatory, but the exemplar uploaders have to fill out two separate forms.
3) Added extra supporting functionality for user management, including password reset and mode selection for users with more than one type.
4) Incorporated the audio recorder with the website for recording and uploading to servers.