Based on EM Algorithm and Information Criterion

Graduate School of Information Science and Technology, The University of Tokyo

7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan

{kameoka,nishi,sagayama}@hil.t.u-tokyo.ac.jp

This paper describes a single-channel multi-pitch detection algorithm.
Although the temporal continuity of the pattern should ultimately be considered,
as a first step we discuss a process that operates independently on each frame.
Multiple harmonic structures are modeled as a mixture of
constrained Gaussian mixtures, each of which models a single harmonic structure.
The algorithm detects both the number of concurrent speakers
and the spectral envelope of each underlying harmonic structure, based on
maximum likelihood estimation
of the model parameters using the EM algorithm and an information criterion.
It requires neither a priori information about F0 contours nor a
restriction on the number of speakers, and it extracts
accurate F0s as continuous values using spectral-domain
procedures. In experiments, the algorithm
outperformed the well-known cepstrum method on speech signals from both a single
speaker and two simultaneous speakers.
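
As a rough illustration of the constrained Gaussian mixture idea, the sketch below fits a single harmonic structure to one frame's power spectrum with EM. It is not the authors' implementation. The linear frequency axis, the component means tied at integer multiples of one F0, the fixed shared width `sigma`, and the function name `fit_harmonic_gmm` are all assumptions made for illustration. The mixture over several speakers and the information criterion are omitted.

```python
import numpy as np

def fit_harmonic_gmm(freqs, power, f0_init, n_harmonics=5, sigma=10.0, n_iter=50):
    """EM fit of one harmonic structure to a power spectrum (illustrative sketch).

    Assumption: the n-th Gaussian has mean n * f0 and a fixed, shared
    standard deviation `sigma`; only f0 and the weights are estimated.
    """
    freqs = np.asarray(freqs, dtype=float)
    p = np.asarray(power, dtype=float)
    p = p / p.sum()                                  # treat the spectrum as a density
    n = np.arange(1, n_harmonics + 1)[:, None]       # harmonic indices, shape (N, 1)
    w = np.full((n_harmonics, 1), 1.0 / n_harmonics) # component weights
    f0 = float(f0_init)

    for _ in range(n_iter):
        # E-step: responsibility of each harmonic for each frequency bin.
        log_joint = np.log(w + 1e-12) - 0.5 * ((freqs[None, :] - n * f0) / sigma) ** 2
        log_joint -= log_joint.max(axis=0, keepdims=True)   # numerical stability
        resp = np.exp(log_joint)
        resp /= resp.sum(axis=0, keepdims=True)

        # M-step: spectral mass attributed to each harmonic in each bin.
        mass = p[None, :] * resp
        # Closed-form update of f0 under the tied-mean constraint mean_n = n * f0.
        f0 = float((mass * n * freqs[None, :]).sum() / (mass * n ** 2).sum())
        # The weights give a coarse picture of the spectral envelope.
        w = mass.sum(axis=1, keepdims=True)

    return f0, w.ravel()
```

Because every mean is tied to the same `f0`, the M-step updates one frequency parameter instead of one mean per component. That tying is what makes the mixture "constrained" in this sketch. The returned weights play the role of a coarse spectral envelope.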

- Introduction
- A Maximum Likelihood Formulation
  - Model of Harmonic Structures
  - Model Parameter Estimation using EM Algorithm
  - Another Interpretation as Clustering
- Multi-pitch Detection Algorithm
  - Criterion of Model Selection
  - Detection of the Number of Speakers
  - Detection of F0s and Spectral Envelopes
- Experiments
- Conclusions
- REFERENCES
