
Multi-pitch Detection Algorithm Using Constrained Gaussian Mixture Model and Information Criterion for Simultaneous Speech


Graduate School of Information Science and Technology
The University of Tokyo, Japan
{kameoka,nishi,sagayama}@hil.t.u-tokyo.ac.jp

Abstract:

This paper describes a co-channel multi-pitch detection algorithm. Such an algorithm is important when prosodic information needs to be extracted separately from the respective $F_0$ patterns of concurrent utterances. Although the temporal continuity of speech prosody should ultimately be taken into account, as a first step we discuss processing carried out independently on each single frame. Multiple harmonic structures are modeled by a mixture of tied Gaussian mixtures, each of which models a single harmonic structure. Our algorithm detects both the number of concurrent speakers and the spectral envelope of each underlying harmonic structure, based on maximum likelihood estimation of the model parameters using the EM algorithm together with an information criterion. It operates without a priori information on $F_0$ contours and without restricting the number of speakers, and it extracts accurate $F_0$ values as continuous quantities with simple procedures in the spectral domain. Experiments showed that our algorithm outperformed the well-known cepstrum method both for speech signals of a single speaker and for two simultaneous speakers.
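As a rough illustration of the "mixture of tied Gaussian mixtures" idea, the sketch below builds a spectral density in which each speaker contributes Gaussians whose means are tied to integer multiples of a single $F_0$. It is only a minimal sketch under stated assumptions: the linear-frequency axis, the shared width `sigma`, and all function and variable names are hypothetical and not taken from the paper, and neither the EM updates nor the information criterion are shown.

```python
import math


def gaussian(x, mean, sigma):
    """Univariate normal density."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def harmonic_density(x, f0, partial_weights, sigma):
    """One harmonic structure as a tied Gaussian mixture.

    Assumption: the n-th component is centred at n * f0 on a linear
    frequency axis, and all components share the same sigma, so the
    only free location parameter is f0.
    """
    total = sum(partial_weights)
    return sum(
        (w / total) * gaussian(x, (n + 1) * f0, sigma)
        for n, w in enumerate(partial_weights)
    )


def multi_pitch_density(x, speakers, sigma):
    """Mixture over speakers; each speaker is (mixing_weight, f0, partial_weights)."""
    total = sum(weight for weight, _, _ in speakers)
    return sum(
        (weight / total) * harmonic_density(x, f0, partials, sigma)
        for weight, f0, partials in speakers
    )


# Hypothetical two-speaker example (values chosen only for illustration).
speakers = [
    (0.6, 100.0, [1.0, 0.8, 0.5]),  # harmonics at 100, 200, 300 Hz
    (0.4, 150.0, [1.0, 0.6, 0.3]),  # harmonics at 150, 300, 450 Hz
]
sigma = 5.0
grid = range(0, 1001)  # 1 Hz steps from 0 to 1000 Hz
area = sum(multi_pitch_density(float(f), speakers, sigma) for f in grid)
```

In this sketch the envelope of each harmonic structure is carried by `partial_weights`, and the relative strength of each speaker by the mixing weight; the density sums to approximately one over the grid because every weight set is normalised.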




March 25, 2004