Next: Noisy Speech Recogniton Using Up: Complex Spectrum Circle Centroid Previous: Theoretical Properties of CSCC

Finding the Complex Spectrum Circle Centroid

The target signal spectrum $ S(\omega )$ can be restored by finding the centroid of the circle on which three or more microphone inputs $ M_i(\omega )$ lie. For $ K=3$, the circle centroid is uniquely determined by the three distinct points on the circle. For $ K > 3$ microphone inputs, the circle centroid can be determined as the point that is nearly equidistant from the observed microphone inputs. We estimate the centroid as the point $ \tilde{S}(\omega)$ that minimizes the variance of the $ K$ squared distances from $ M_i(\omega )$, i.e.,

$\displaystyle \tilde{S}(\omega )=\mathop{\rm argmin}_{Z(\omega)} \mathop{\mathrm{Var}}\nolimits \Big[\Vert Z(\omega)-M_i(\omega) \Vert^2\Big]$     (3)

for arbitrary $ K=4,5,\cdots$, where $ Z(\omega)$ is a point on the complex spectrum plane. The cases $ K=1,2,$ or $ 3$ can also be included; there the minimum variance is 0.
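The objective in Eq. (3) can be evaluated directly. The following is a minimal Python sketch, assuming the microphone spectra at a single frequency are available as a list of complex numbers. The function name `squared_distance_variance` and the sample points are hypothetical and serve only to illustrate that the variance vanishes at the centre of a circle passing through all points and is positive elsewhere.

```python
def squared_distance_variance(z, mics):
    """Population (1/K) variance of the squared distances |z - m_i|^2."""
    # Squared distance from the candidate point z to each microphone input.
    d2 = [(z - m).real ** 2 + (z - m).imag ** 2 for m in mics]
    mean = sum(d2) / len(d2)
    return sum((d - mean) ** 2 for d in d2) / len(d2)


# Hypothetical inputs: four points on a circle of radius 2 centred at 1+1j.
center = 1 + 1j
mics = [center + 2, center + 2j, center - 2, center - 2j]

print(squared_distance_variance(center, mics))  # 0.0
print(squared_distance_variance(0j, mics))      # 16.0
```

Any candidate $ Z(\omega)$ other than the circle centre gives unequal distances and hence a positive variance, which is what the minimization exploits.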

To solve this equation, let $ X$ and $ Y$ be the real and imaginary parts of $ Z(\omega)$, i.e., $ Z=X+jY$, and let $ x_i$ and $ y_i$ be those of $ M_i(\omega) = x_i + j y_i$. Then, we have

$\displaystyle \begin{aligned}
\mathop{\mathrm{Var}}\nolimits \Big[ \Vert Z - M_i \Vert ^2 \Big]
&= \frac{1}{K} \sum_{i=1}^K \Vert Z - M_i \Vert ^4 - \Big( \frac{1}{K} \sum_{i=1}^K \Vert Z - M_i \Vert ^2 \Big) ^2 \\
&= \frac{1}{K} \sum_{i=1}^K \left( \left(X-x_i\right)^2 +\left(Y-y_i\right)^2 \right)^2 - \Big( \frac{1}{K} \sum_{i=1}^K \left( \left(X-x_i\right)^2 +\left(Y-y_i \right)^2 \right) \Big) ^2
\end{aligned}$ (4)

from which the partial derivatives with respect to $ X$ and $ Y$ are derived as:
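$\displaystyle \begin{pmatrix} \frac{\displaystyle\partial}{\displaystyle\partial X} \\[1ex] \frac{\displaystyle\partial}{\displaystyle\partial Y} \end{pmatrix} \mathop{\mathrm{Var}}\nolimits \Big[ \Vert Z - M_i \Vert^2 \Big] = 8 \begin{pmatrix} \mathop{\mathrm{Var}}\nolimits [x_i] & \mathop{\mathrm{Cov}}\nolimits [x_i,y_i] \\ \mathop{\mathrm{Cov}}\nolimits [x_i,y_i] & \mathop{\mathrm{Var}}\nolimits [y_i] \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix} - 4 \begin{pmatrix} \mathop{\mathrm{Cov}}\nolimits [x_i,x_i^2] + \mathop{\mathrm{Cov}}\nolimits [x_i,y_i^2] \\ \mathop{\mathrm{Cov}}\nolimits [y_i,y_i^2] + \mathop{\mathrm{Cov}}\nolimits [y_i,x_i^2] \end{pmatrix}$ (5)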

where $ \mathop{\mathrm{Cov}}\nolimits [a,b]$ denotes the covariance of $ a$ and $ b$. Setting the left-hand side of the above equation to 0 yields a linear equation for the centroid that minimizes the variance of squared distances in Eq. (3):

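$\displaystyle \begin{pmatrix} X \\[1ex] Y \end{pmatrix} = \frac{1}{2} \begin{pmatrix} \mathop{\mathrm{Var}}\nolimits [x_i] & \mathop{\mathrm{Cov}}\nolimits [x_i,y_i] \\ \mathop{\mathrm{Cov}}\nolimits [x_i,y_i] & \mathop{\mathrm{Var}}\nolimits [y_i] \end{pmatrix}^{-1} \begin{pmatrix} \mathop{\mathrm{Cov}}\nolimits [x_i,x_i^2] + \mathop{\mathrm{Cov}}\nolimits [x_i,y_i^2] \\ \mathop{\mathrm{Cov}}\nolimits [y_i,y_i^2] + \mathop{\mathrm{Cov}}\nolimits [y_i,x_i^2] \end{pmatrix}$ (6)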

whose solutions $ X$ and $ Y$ give the estimated complex spectrum centroid for each frequency as $ \tilde{S}(\omega) = X(\omega)+jY(\omega)$.
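The closed-form solution can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: it works on one frequency at a time, uses the $ 1/K$ (population) statistics of Eq. (4), and solves the 2x2 linear system obtained by setting the gradient of Eq. (4) to zero with Cramer's rule. The names `cov` and `circle_centroid` and the sample points are hypothetical.

```python
def cov(a, b):
    """Population (1/K) covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n


def circle_centroid(mics):
    """Point minimizing the variance of squared distances to the inputs."""
    x = [m.real for m in mics]
    y = [m.imag for m in mics]
    x2 = [v * v for v in x]
    y2 = [v * v for v in y]

    # Entries of the 2x2 covariance matrix of (x_i, y_i).
    var_x, var_y, cov_xy = cov(x, x), cov(y, y), cov(x, y)
    # Right-hand side: covariances with the squared coordinates.
    b_x = cov(x, x2) + cov(x, y2)
    b_y = cov(y, y2) + cov(y, x2)

    det = var_x * var_y - cov_xy ** 2
    if det == 0.0:
        raise ValueError("points are collinear; the centroid is undefined")

    # Solve 2 * [[var_x, cov_xy], [cov_xy, var_y]] (X, Y)^T = (b_x, b_y)^T.
    X = (var_y * b_x - cov_xy * b_y) / (2.0 * det)
    Y = (var_x * b_y - cov_xy * b_x) / (2.0 * det)
    return complex(X, Y)


# Hypothetical inputs: four points on a circle of radius 2 centred at 1+1j.
mics = [3 + 1j, 1 + 3j, -1 + 1j, 1 - 1j]
print(circle_centroid(mics))  # (1+1j)
```

In practice the same computation would be repeated independently for every frequency $ \omega$; an exact zero test on the determinant is only a placeholder for a proper conditioning check.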

Eq. (6) has a solution if the covariance matrix:

$\displaystyle C_{xy}\equiv \frac{1}{2} \begin{pmatrix} \mathop{\mathrm{Var}}\nolimits [x_i] & \mathop{\mathrm{Cov}}\nolimits [x_i,y_i] \\ \mathop{\mathrm{Cov}}\nolimits [x_i,y_i] & \mathop{\mathrm{Var}}\nolimits [y_i] \end{pmatrix}$ (7)

between $ x_i$ and $ y_i$ is regular (nonsingular). Since its determinant is given by

$\displaystyle \vert C_{xy} \vert = \frac{1}{4} \mathop{\mathrm{Var}}\nolimits [x_i]\mathop{\mathrm{Var}}\nolimits [y_i] \left( 1 - r_{xy}^2 \right), \qquad r_{xy} = \frac{\mathop{\mathrm{Cov}}\nolimits [x_i,y_i]}{\sqrt{\mathop{\mathrm{Var}}\nolimits [x_i]\mathop{\mathrm{Var}}\nolimits [y_i]}}$ (8)

where $ r_{xy}$ is the correlation coefficient between $ x_i$ and $ y_i$, the solution of Eq. (6) is guaranteed to exist unless $ \vert r_{xy} \vert =1$, i.e., unless all spectrum points $ M_i(\omega),  i=1, 2, \dots, K$ lie on a line in the complex plane.

Although $ \vert r_{xy} \vert$ never exceeds 1, in numerically ill-conditioned cases such as $ \vert r_{xy} \vert >0.99$, we use the center of gravity of the $ K$ points $ M_i(\omega )$, i.e., the delay-and-sum solution, instead of the circle centroid.
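The fallback rule can be combined with the centroid computation as in the following Python sketch. It is an illustration under assumptions rather than the authors' code: the function name `estimate_spectrum`, the `r_max` parameter, and the sample inputs are hypothetical, and the check is applied to the magnitude of the correlation coefficient so that negatively correlated, nearly collinear points are also caught. The inputs are assumed to be the complex microphone spectra at one frequency.

```python
import math


def cov(a, b):
    """Population (1/K) covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n


def estimate_spectrum(mics, r_max=0.99):
    """Return (estimate, method); fall back to the mean when ill-conditioned."""
    x = [m.real for m in mics]
    y = [m.imag for m in mics]
    var_x, var_y, cov_xy = cov(x, x), cov(y, y), cov(x, y)

    # Zero spread along either axis also means the points lie on a line.
    spread = math.sqrt(var_x * var_y)
    if spread == 0.0 or abs(cov_xy) / spread > r_max:
        # Center of gravity of the inputs (delay-and-sum solution).
        return sum(mics) / len(mics), "delay-and-sum"

    x2 = [v * v for v in x]
    y2 = [v * v for v in y]
    b_x = cov(x, x2) + cov(x, y2)
    b_y = cov(y, y2) + cov(y, x2)
    det = var_x * var_y - cov_xy ** 2
    # Cramer's rule for 2 * [[var_x, cov_xy], [cov_xy, var_y]] (X, Y)^T = b.
    X = (var_y * b_x - cov_xy * b_y) / (2.0 * det)
    Y = (var_x * b_y - cov_xy * b_x) / (2.0 * det)
    return complex(X, Y), "circle-centroid"


# Hypothetical nearly collinear inputs trigger the fallback.
print(estimate_spectrum([0j, 1 + 1j, 2 + 2j, 3 + 3.001j])[1])  # delay-and-sum
# Hypothetical well-spread inputs use the circle centroid.
print(estimate_spectrum([3 + 1j, 1 + 3j, -1 + 1j, 1 - 1j])[1])  # circle-centroid
```

Returning the method label alongside the estimate makes it easy to count how often the fallback is taken across frequencies.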


September 23, 2004