Model Composition by Lagrange Polynomial Approximation for Robust Speech Recognition in Noisy Environment

Graduate School of Information Science and Technology
The University of Tokyo
7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656 JAPAN
{raut, nishi, sagayama}@hil.t.u-tokyo.ac.jp

This paper presents a technique for estimating the HMM parameters of noisy speech from a given clean-speech HMM and a noise HMM. The non-linear function governing the relationship between speech and noise is approximated by a Lagrange polynomial, which gives the distribution of the corrupted speech parameters a closed form. The method is computationally efficient, and experiments show a significant improvement in recognition performance on noisy speech. For example, on an isolated word recognition task with exhibition hall noise added at 10 dB SNR, word accuracy increased from 9.2% with the clean model to 82.8% with the model composed by the proposed method, compared with 45.4% for the model composed by the PMC log-normal approximation.

Summary: