Probability Distribution of Time-Series of Speech Spectral Components

Rajkishore PRASAD  Hiroshi SARUWATARI  Kiyohiro SHIKANO  

IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E87-A   No.3   pp.584-597
Publication Date: 2004/03/01
Print ISSN: 0916-8508
Type of Manuscript: Special Section PAPER (Special Section on Applications and Implementations of Digital Signal Processing)
Category: Audio/Speech Coding
Keywords: probabilistic modeling, spectral components of speech, microphone-array, GGD, FDICA

This paper deals with the statistical modeling of a Time-Frequency Series of Speech (TFSS), obtained by Short-Time Fourier Transform (STFT) analysis of a speech signal picked up by a two-element linear microphone array. We attempt to find a closer match between the distribution of the TFSS and theoretical distributions such as the Laplacian Distribution (LD), the Gaussian Distribution (GD), and the Generalized Gaussian Distribution (GGD), with parameters estimated from the TFSS data. The GGD is found to provide the best model for the real parts, imaginary parts, and polar magnitudes of the time series of the spectral components. The distribution of the polar magnitude is closer to the LD than those of the real and imaginary parts. The distributions of the real and imaginary parts of the TFSS correspond strongly to the LD. The phase of the TFSS is found to be uniformly distributed. Using a GGD-based model as the probability density function (PDF) in fixed-point Frequency-Domain Independent Component Analysis (FDICA) provides better separation performance and significantly improves convergence speed.
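
As an illustration of what estimating GGD parameters from data can look like, the sketch below fits the GGD shape parameter by moment matching. It is not taken from the paper, whose estimator is not described in this abstract; the function names `ggd_moment_ratio` and `estimate_ggd_shape`, the search bounds, and the synthetic test data are assumptions made for the example. It uses the standard zero-mean GGD density f(x) = β / (2αΓ(1/β)) · exp(−(|x|/α)^β), for which β = 1 gives the Laplacian and β = 2 the Gaussian, and the identity E[x²] / (E|x|)² = Γ(1/β)Γ(3/β) / Γ(2/β)².

```python
import math
import random


def ggd_moment_ratio(beta):
    """Return E[x^2] / (E|x|)^2 for a zero-mean GGD with shape parameter beta."""
    a = 1.0 / beta
    # Work in log-gamma space to avoid overflow for small beta.
    return math.exp(math.lgamma(a) + math.lgamma(3.0 * a) - 2.0 * math.lgamma(2.0 * a))


def estimate_ggd_shape(samples, lo=0.1, hi=10.0, iters=60):
    """Moment-matching estimate of the GGD shape parameter (illustrative only).

    The search interval [lo, hi] is an assumption of this sketch; estimates
    outside it are clamped to the nearest bound.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    m1 = sum(abs(x) for x in centered) / n   # first absolute moment
    m2 = sum(x * x for x in centered) / n    # second moment (variance)
    target = m2 / (m1 * m1)
    # The theoretical ratio decreases monotonically in beta, so bisection works.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ggd_moment_ratio(mid) > target:
            lo = mid   # ratio too large: the true beta is larger than mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Synthetic stand-ins for, e.g., the real part of one frequency bin over time.
rng = random.Random(0)
laplace = [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(50000)]
gauss = [rng.gauss(0.0, 1.0) for _ in range(50000)]

beta_laplace = estimate_ggd_shape(laplace)
beta_gauss = estimate_ggd_shape(gauss)
print("shape estimate, Laplacian samples:", round(beta_laplace, 3))
print("shape estimate, Gaussian samples:", round(beta_gauss, 3))
```

With real data, the samples would be the real part, imaginary part, or magnitude of one STFT frequency bin across frames. An estimated shape near 1 would indicate Laplacian-like behavior, and one near 2 Gaussian-like behavior.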