Sagayama2004ICSLP-rev3

Complex Spectrum Circle Centroid for Microphone-Array-Based Noisy Speech Recognition

Shigeki Sagayama, Takashi Okajima, Yutaka Kamamoto, Takuya Nishimoto
Graduate School of Information Science and Technology
The University of Tokyo
Hongo, Bunkyo-ku, Tokyo, Japan

We propose a novel principle, the Complex Spectrum Circle Centroid (CSCC), for restoring the complex spectrum of a target signal from multiple microphone input signals in a noisy environment. If noise arrives at the microphones with different time delays relative to the target signal, the observed noisy signals lie on a circle in the complex spectrum plane, and the target signal is restored by finding the centroid of that circle. Unlike most existing noise reduction methods such as ICA, AMNOR, and beamforming, this non-linear operation is applicable to any type of noise, including non-stationary, moving, signal-correlated, and non-planar noise as well as interfering speakers, and it requires neither identifying the noise direction nor training parameters.
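The geometric idea can be illustrated with a minimal sketch for a single frequency bin. It assumes a narrowband model that is not spelled out in the abstract: the target component S is time-aligned across microphones, and a single noise component N reaches microphone i with its own phase shift, so each observation is X_i = S + N·exp(-jφ_i) and every X_i lies at distance |N| from S. The function names, the test values, and the choice to average the circumcenters of all microphone triples are illustrative assumptions, not necessarily the estimator used in the paper.

```python
import cmath
from itertools import combinations


def circumcenter(a, b, c):
    """Center of the circle through three points a, b, c of the complex plane."""
    num = (abs(a) ** 2 * (b - c)
           + abs(b) ** 2 * (c - a)
           + abs(c) ** 2 * (a - b))
    den = (a.conjugate() * (b - c)
           + b.conjugate() * (c - a)
           + c.conjugate() * (a - b))
    # The denominator vanishes when the points are collinear or coincide.
    # The absolute threshold is an arbitrary choice for this sketch.
    if abs(den) < 1e-12:
        raise ValueError("points are collinear or coincident")
    return num / den


def estimate_target(observations):
    """Estimate the target spectrum in one bin from three or more microphones.

    Hypothetical estimator: average the circumcenters of all microphone triples.
    """
    if len(observations) < 3:
        raise ValueError("need at least three observations")
    centers = [circumcenter(a, b, c)
               for a, b, c in combinations(observations, 3)]
    return sum(centers) / len(centers)


# Synthetic example under the assumed model (all values are made up).
target = 0.8 + 0.3j                      # target spectrum S in this bin
noise = 1.5 * cmath.exp(0.7j)            # noise spectrum N in this bin
phases = [0.0, 0.9, 2.1, 3.4]            # per-microphone noise phase shifts
observed = [target + noise * cmath.exp(-1j * p) for p in phases]

estimate = estimate_target(observed)
print("estimated target:", estimate)
print("absolute error:  ", abs(estimate - target))
```

Note that the noise in this example is larger in magnitude than the target, yet the circle center does not depend on the noise amplitude or on knowledge of the phase shifts. It only requires that the shifts differ between microphones, so that the observations do not coincide.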

The proposed method was evaluated in speech recognition experiments in simulated noisy environments. With a single spoken noise source, it raised word accuracy close to the clean-speech recognition rate of 89.4%; with three interfering speakers, it raised word accuracy from 0% with one microphone to 60.6% with eight microphones. The properties of the new method are further discussed theoretically and experimentally.

Summary: