Prosody-Dependent Acoustic Modeling Using Variable-Parameter Hidden Markov Models


Jui-Ting Huang, Po-Sen Huang, Yoonsook Mo, Mark Hasegawa-Johnson, Jennifer Cole, University of Illinois at Urbana-Champaign

To make prosody useful in spontaneous speech recognition, we adopt a quasi-continuous prosodic annotation and design a prosody-dependent acoustic model around it to improve ASR performance. We propose a variable-parameter hidden Markov model in which the mean vector is modeled as a function of the prosody variable through polynomial regression. The prosodically adapted acoustic models are used to re-score the N-best output of a standard ASR system, according to the prosody variable assigned by an automatic prosody detector. Experiments on the Buckeye corpus demonstrate the effectiveness of our approach.
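
The two ideas in the abstract, a mean vector that varies polynomially with the prosody variable and N-best re-scoring with the adapted model, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the diagonal-covariance Gaussian, the polynomial order implied by the number of coefficient vectors, and the weighted sum used to combine baseline and adapted scores.

```python
import math

def prosody_dependent_mean(coeffs, v):
    """Evaluate a polynomial-regression mean at prosody value v.

    coeffs[k] is the (hypothetical) coefficient vector multiplying v**k,
    so len(coeffs) - 1 is the polynomial order.
    """
    dim = len(coeffs[0])
    return [sum(c[d] * v ** k for k, c in enumerate(coeffs)) for d in range(dim)]

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian (assumed output density)."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * s) + (xi - m) ** 2 / s
        for xi, m, s in zip(x, mean, var)
    )

def rescore_nbest(hypotheses, weight=1.0):
    """Pick the best hypothesis from (label, baseline_score, adapted_score) tuples.

    The weighted sum of the two log scores is an assumed combination rule.
    """
    return max(hypotheses, key=lambda h: h[1] + weight * h[2])[0]

# Second-order example: mean(v) = c0 + c1 * v + c2 * v**2, two feature dimensions.
coeffs = [[1.0, 0.0], [2.0, -1.0], [0.5, 0.0]]
mean_at_2 = prosody_dependent_mean(coeffs, 2.0)

# The adapted score can overturn the baseline ranking.
best = rescore_nbest([("hyp_a", -10.0, -5.0), ("hyp_b", -11.0, -3.0)])
```

In this toy setting the frame likelihood would be obtained by passing `prosody_dependent_mean(coeffs, v)` to `log_gaussian_diag`, with `v` supplied per segment by a prosody detector.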