
Model of the Environment

An acoustical model demonstrating the effect of additive noise $n[m]$ and channel filtering $h[m]$ on a clean speech signal $x[m]$ is shown in Figure 1.

**Figure 1:** Model of the acoustical environment.

The corrupted speech is given by:

$\displaystyle y[m]=x[m]*h[m]+n[m]$ (1)

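where $m$ is the sample index and $*$ denotes convolution. In the power spectral domain, neglecting the cross term between the filtered speech and the noise (which averages to zero when the two are uncorrelated):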

$\displaystyle \vert Y(f)\vert^2\approx \vert X(f)\vert^2\vert H(f)\vert^2+\vert N(f)\vert^2$ (2)

$\displaystyle \Rightarrow \ln\vert Y(f)\vert^2\approx \ln\vert X(f)\vert^2+\ln\vert H(f)\vert^2+\ln\Big(1+e^{\ln\vert N(f)\vert^2-\ln\vert X(f)\vert^2-\ln\vert H(f)\vert^2}\Big)$ (3)

$\displaystyle \Rightarrow y=x+h+\ln(1+e^{n-x-h})$ (4)

where $x$, $n$, $h$ and $y$ represent the log-spectral energies of the clean signal, additive noise, convolutive noise and corrupted signal, respectively, for a given frequency $f$.
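The mismatch function of Eq. (4) can be evaluated directly, as in the minimal Python sketch below. The function name `corrupted_log_energy` and the default `h = 0.0` (no channel distortion) are assumptions made for illustration and do not come from the paper. The sketch uses the identity $x+h+\ln(1+e^{n-x-h})=\ln(e^{x+h}+e^{n})$ and factors out the larger exponent so that the exponential cannot overflow.

```python
import math


def corrupted_log_energy(x: float, n: float, h: float = 0.0) -> float:
    """Evaluate Eq. (4): y = x + h + ln(1 + exp(n - x - h)).

    x, n and h are natural-log spectral energies of the clean speech,
    the additive noise and the channel for one frequency bin.
    """
    a = x + h  # log energy of the channel-filtered clean speech
    b = n      # log energy of the additive noise
    hi, lo = max(a, b), min(a, b)
    # ln(e^a + e^b) = hi + ln(1 + e^(lo - hi)); the exponent is <= 0,
    # so math.exp cannot overflow even for extreme inputs.
    return hi + math.log1p(math.exp(lo - hi))
```

When the filtered speech dominates ($x+h \gg n$) the result approaches $x+h$, when the noise dominates it approaches $n$, and when the two are equal the result exceeds both by $\ln 2$.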

Thus, the relationship between speech and noise is non-linear, as given in Eq. (4). Experiments show that even if the noise and clean speech parameters are Gaussian-distributed in the log domain, the corrupted speech parameters are no longer Gaussian-distributed. However, if the parameters have low variances and a number of Gaussian mixture components are used to model their distributions, the distribution of the corrupted speech parameters can still be assumed to be Gaussian without much loss of accuracy. This makes it possible to keep using the same decoder optimized for Gaussian distributions.
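The effect of variance on the Gaussian assumption can be illustrated with a small Monte Carlo sketch. This is not the experiment referred to above. The means, standard deviations, sample count, seed and the choice of sample skewness as a measure of non-Gaussianity are all assumptions made for illustration, and the channel term is set to $h=0$. A Gaussian has zero skewness, so a value far from zero indicates a clearly non-Gaussian distribution.

```python
import math
import random


def log_add(a: float, b: float) -> float:
    """ln(e^a + e^b), i.e. Eq. (4) with a = x + h and b = n."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def skewness(samples: list[float]) -> float:
    """Sample skewness: third central moment over variance^(3/2)."""
    count = len(samples)
    mean = sum(samples) / count
    var = sum((s - mean) ** 2 for s in samples) / count
    third = sum((s - mean) ** 3 for s in samples) / count
    return third / var ** 1.5


def simulate(mu_x: float, sigma_x: float, mu_n: float, sigma_n: float,
             count: int = 20000, seed: int = 0) -> list[float]:
    """Draw Gaussian log-energies x and n, return corrupted y (h = 0)."""
    rng = random.Random(seed)
    return [log_add(rng.gauss(mu_x, sigma_x), rng.gauss(mu_n, sigma_n))
            for _ in range(count)]


# Hypothetical settings: high-variance speech versus low-variance speech,
# both against low-variance noise with the same mean.
wide = skewness(simulate(0.0, 3.0, 0.0, 0.1))
narrow = skewness(simulate(0.0, 0.1, 0.0, 0.1))
print(f"skewness, sigma_x = 3.0: {wide:.3f}")
print(f"skewness, sigma_x = 0.1: {narrow:.3f}")
```

With these settings the high-variance case is strongly right-skewed, while the low-variance case stays close to zero skewness, because Eq. (4) is nearly linear over the narrow range the inputs cover.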


April 23, 2004