Acoustic Feature Transformation Based on Discriminant Analysis Preserving Local Structure for Speech Recognition

Makoto SAKAI, Norihide KITAOKA, Kazuya TAKEDA

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E93-D   No.5   pp.1244-1252
Publication Date: 2010/05/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E93.D.1244
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech and Hearing
Keyword: speech recognition, feature extraction, multidimensional signal processing

Summary: 
To improve speech recognition performance, feature transformations based on discriminant analysis are widely used to remove redundant dimensions from acoustic features. Linear discriminant analysis (LDA) and heteroscedastic discriminant analysis (HDA) are often used for this purpose, and a generalization of LDA and HDA, called power LDA (PLDA), has been proposed. However, these methods may result in an unexpected dimensionality reduction for multimodal data, where it is important to preserve the local structure of the data. In this paper, we introduce two methods, locality-preserving HDA and locality-preserving PLDA, that reduce the dimensionality of multimodal data appropriately. We also propose an approximate calculation scheme that computes sub-optimal projections rapidly. Experimental results show that the locality-preserving methods yield better speech recognition performance than the traditional ones.
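
For orientation, the classical LDA baseline mentioned in the summary can be sketched as follows. This is a minimal NumPy illustration of standard LDA only, not the locality-preserving HDA or PLDA methods proposed in the paper. The function name `lda_projection` and its arguments are hypothetical names chosen for this sketch, and it assumes the within-class scatter matrix is invertible.

```python
import numpy as np


def lda_projection(X, y, n_components):
    """Return a (d, n_components) projection matrix from classical LDA.

    Hypothetical helper for illustration; assumes the within-class
    scatter matrix is invertible.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    d = X.shape[1]
    mean_all = X.mean(axis=0)

    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        Sw += centered.T @ centered
        diff = (mean_c - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)

    # Directions maximizing between-class over within-class scatter
    # are the leading eigenvectors of Sw^{-1} Sb.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(evals.real)[::-1][:n_components]
    return evecs[:, order].real


# Usage on synthetic data (not from the paper): two classes that differ
# only along the first axis, with a noisy second axis.
rng = np.random.default_rng(0)
scale = np.array([1.0, 5.0, 1.0])
X0 = rng.normal(size=(200, 3)) * scale
X1 = rng.normal(size=(200, 3)) * scale + np.array([3.0, 0.0, 0.0])
X = np.vstack([X0, X1])
y = np.repeat([0, 1], 200)

B = lda_projection(X, y, n_components=1)
Z = X @ B  # reduced one-dimensional features
```

Because LDA summarizes each class by a single mean and a pooled scatter matrix, it has no view of clusters inside a class, which is the limitation for multimodal data that the summary points to.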