Variable Selection Linear Regression for Robust Speech Recognition
Yu TSAO, Ting-Yao HU, Sakriani SAKTI, Satoshi NAKAMURA, Lin-shan LEE
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2014/06/01
Online ISSN: 1745-1361
Type of Manuscript: Special Section PAPER (Special Section on Advances in Modeling for Real-world Speech Information Processing and its Application)
Category: Speech Recognition
Keywords: variable selection, linear regression, MLLR, fMLLR, model space adaptation, feature space adaptation
This study proposes a variable selection linear regression (VSLR) adaptation framework to improve the accuracy of automatic speech recognition (ASR) when only limited, unlabeled adaptation data are available. The framework consists of three phases. The first phase prepares multiple variable subsets by applying a ranking filter to the original regression variable set. The second phase determines the best variable subset according to a pre-determined performance evaluation criterion and computes a linear regression (LR) mapping function from that subset. The third phase performs adaptation in either the model space or the feature space. Together, the three phases select the optimal components and remove redundancies in the LR mapping function, which enables VSLR to provide satisfactory adaptation performance even with a very limited number of adaptation statistics. We formulate model space VSLR and feature space VSLR by integrating the variable selection (VS) techniques into conventional LR adaptation systems. Experimental results on the Aurora-4 task show that model space VSLR and feature space VSLR outperform standard maximum likelihood linear regression (MLLR) and feature space MLLR (fMLLR), respectively, as well as their extensions, with notable word error rate (WER) reductions under per-utterance unsupervised adaptation.
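The three phases described in the abstract can be illustrated with a generic variable-selection regression sketch. This is not the paper's MLLR/fMLLR formulation, which adapts acoustic model parameters or features; it only shows the rank, select, and map pattern on an ordinary least-squares problem. The abstract does not specify the ranking filter or the evaluation criterion, so the absolute-correlation filter, the held-out mean squared error criterion, and all function names below (`rank_variables`, `select_and_fit`, `apply_mapping`) are assumptions made for illustration.

```python
import numpy as np

def rank_variables(X, y):
    """Phase 1 (assumed filter): order columns of X by |Pearson correlation| with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12
    scores = np.abs(Xc.T @ yc) / denom
    return np.argsort(-scores)  # highest-scoring variable first

def fit_lr(X, y):
    """Least-squares fit of y ~ X w + b; the bias b is stored as the last weight."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict_lr(X, w):
    return X @ w[:-1] + w[-1]

def select_and_fit(X_tr, y_tr, X_val, y_val):
    """Phase 2 (assumed criterion): pick the nested top-k subset with the
    lowest held-out mean squared error and return it with its LR weights."""
    order = rank_variables(X_tr, y_tr)
    best = None
    for k in range(1, X_tr.shape[1] + 1):
        subset = order[:k]
        w = fit_lr(X_tr[:, subset], y_tr)
        err = np.mean((predict_lr(X_val[:, subset], w) - y_val) ** 2)
        if best is None or err < best[0]:
            best = (err, subset, w)
    return best[1], best[2]

def apply_mapping(X, subset, w):
    """Phase 3: apply the selected LR mapping to new data."""
    return predict_lr(X[:, subset], w)

if __name__ == "__main__":
    # Synthetic data: only columns 0 and 3 carry signal (hypothetical example).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(80, 8))
    y = 2.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=80)
    subset, w = select_and_fit(X[:40], y[:40], X[40:], y[40:])
    print("selected variables:", sorted(subset.tolist()))
```

Restricting the search to nested top-k subsets keeps the number of candidate fits linear in the number of variables, which is one plausible reason to rank first and select second when adaptation data are scarce.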