The accuracy of speech processing systems is strongly affected by acoustic
noise, a serious obstacle to meeting the demands of modern applications. These
systems therefore often need a noise reduction algorithm working in combination with a precise voice activity
detector (VAD). The computation needed for denoising and speech detection must not
exceed the limits imposed by real-time speech processing systems. This chapter presents a
novel VAD that improves speech detection robustness in noisy environments and the performance
of speech recognition systems in real-time applications. The algorithm is based on a Multivariate
Complex Gaussian (MCG) observation model and defines an optimal likelihood ratio test (LRT)
involving Multiple and Correlated Observations (MCO) based on a jointly Gaussian probability
distribution (jGpdf) and a symmetric covariance matrix. The complete derivation of the
jGpdf-LRT for the general case of a symmetric covariance matrix is given in terms of the Cholesky
decomposition, which allows the VAD decision rule to be computed efficiently. An extensive analysis
of the proposed methodology for a low dimensional observation model demonstrates: i) the
improved robustness of the proposed approach by means of a clear reduction of the classification
error as the number of observations is increased, and ii) the trade-off between the number of
observations and the detection performance. The proposed strategy is also compared to different
VAD methods, including the G.729, AMR and AFE standards as well as other recently reported
algorithms, and shows a sustained advantage in speech/non-speech detection accuracy and speech
recognition performance on the Aurora databases.
Keywords: Voice activity detection, generalized complex Gaussian probability distribution function, robust speech
recognition
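
To make the role of the Cholesky decomposition in the decision rule concrete, the following is a minimal sketch and not the chapter's derivation. It assumes a simplified real-valued, zero-mean Gaussian model rather than the complex-valued MCG model, and it assumes the noise and speech covariance matrices are already known. All function names (cholesky, forward_substitute, gaussian_log_likelihood, log_likelihood_ratio, vad_decision) and the threshold parameter are hypothetical and chosen only for illustration.

```python
import math


def cholesky(c):
    """Return lower-triangular L with c = L L^T (c symmetric positive definite)."""
    n = len(c)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = c[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                l[i][j] = math.sqrt(s)
            else:
                l[i][j] = s / l[j][j]
    return l


def forward_substitute(l, x):
    """Solve L y = x for y, with L lower triangular."""
    y = []
    for i in range(len(x)):
        s = x[i] - sum(l[i][k] * y[k] for k in range(i))
        y.append(s / l[i][i])
    return y


def gaussian_log_likelihood(x, c):
    """Log-density of a zero-mean real Gaussian with covariance c at x.

    With c = L L^T: log det c = 2 * sum(log L_ii) and
    x^T c^{-1} x = ||y||^2, where L y = x.
    """
    n = len(x)
    l = cholesky(c)
    y = forward_substitute(l, x)
    log_det = 2.0 * sum(math.log(l[i][i]) for i in range(n))
    quad = sum(v * v for v in y)
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)


def log_likelihood_ratio(x, c_noise, c_speech):
    """Log LRT statistic: speech-plus-noise hypothesis against noise-only."""
    return gaussian_log_likelihood(x, c_speech) - gaussian_log_likelihood(x, c_noise)


def vad_decision(x, c_noise, c_speech, threshold=0.0):
    """Declare speech when the log-likelihood ratio exceeds the threshold (assumed value)."""
    return log_likelihood_ratio(x, c_noise, c_speech) > threshold


if __name__ == "__main__":
    # Illustrative covariances only; these are not taken from the chapter.
    c_noise = [[1.0, 0.0], [0.0, 1.0]]
    c_speech = [[4.0, 1.0], [1.0, 4.0]]
    print(vad_decision([3.0, 3.0], c_noise, c_speech))
    print(vad_decision([0.1, 0.1], c_noise, c_speech))
```

In this sketch the Cholesky factor supplies both the log-determinant and the quadratic form, so no explicit matrix inverse is ever formed. That is the general reason such a factorization makes a Gaussian likelihood ratio cheap to evaluate.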