Continuous Speech Recognition Using an On-Line Speaker Adaptation Method Based on Automatic Speaker Clustering


IEICE TRANSACTIONS on Information and Systems, Vol.E86-D, No.3, pp.464-473
Publication Date: 2003/03/01
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Issue on Speech Information Processing)
Category: Speech and Speaker Recognition
Keyword: speaker adaptation, speech recognition, speaker clustering, MLLR, MAP


This paper evaluates an on-line incremental speaker adaptation method for co-channel conversations involving multiple speakers, under the assumption that the speaker is unknown and changes frequently. Each utterance is first assigned to a speaker cluster based on Vector Quantization (VQ) distortion, and the acoustic models for each cluster are then adapted by Maximum Likelihood Linear Regression (MLLR) or Maximum A Posteriori (MAP) estimation, which can improve the performance of continuous speech recognition. To demonstrate the effectiveness of the speaker clustering method, continuous speech recognition experiments were conducted with both supervised and unsupervised cluster adaptation. Finally, evaluation experiments on separately prepared test data were performed for continuous syllable recognition and large vocabulary continuous speech recognition (LVCSR). The experimental results strongly support the effectiveness of the speaker adaptation and clustering methods presented in this paper.
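The per-utterance clustering step can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the codebook size, the rule for opening a new cluster, or the feature type, so the `threshold` parameter, the tiny k-means codebook trainer, and the names `OnlineSpeakerClusterer`, `train_codebook`, and `vq_distortion` are all assumptions made for illustration. Each utterance is assumed to be a non-empty list of feature vectors.

```python
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_distortion(frames: List[Vector], codebook: List[Vector]) -> float:
    """Average squared distance from each frame to its nearest codeword."""
    return sum(min(sq_dist(f, c) for c in codebook) for f in frames) / len(frames)


def train_codebook(frames: List[Vector], size: int, iters: int = 5) -> List[Vector]:
    """Tiny k-means with deterministic initialisation (assumed, not from the paper)."""
    size = min(size, len(frames))
    step = len(frames) / size
    codebook = [list(frames[int(i * step)]) for i in range(size)]
    for _ in range(iters):
        buckets = [[] for _ in codebook]
        for f in frames:
            nearest = min(range(len(codebook)), key=lambda i: sq_dist(f, codebook[i]))
            buckets[nearest].append(f)
        for i, bucket in enumerate(buckets):
            if bucket:  # keep the old codeword if no frame was assigned to it
                codebook[i] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return codebook


class OnlineSpeakerClusterer:
    """Assigns each incoming utterance to the cluster with the lowest VQ distortion.

    A new cluster is opened when the best distortion exceeds `threshold`
    (a hypothetical decision rule chosen for this sketch).
    """

    def __init__(self, threshold: float, codebook_size: int = 2):
        self.threshold = threshold
        self.codebook_size = codebook_size
        self.codebooks: List[List[Vector]] = []

    def assign(self, frames: List[Vector]) -> int:
        if self.codebooks:
            scores = [vq_distortion(frames, cb) for cb in self.codebooks]
            best = min(range(len(scores)), key=scores.__getitem__)
            if scores[best] <= self.threshold:
                return best
        # No existing cluster is close enough: open a new one from this utterance.
        self.codebooks.append(train_codebook(frames, self.codebook_size))
        return len(self.codebooks) - 1


# Toy 2-dimensional "utterances" from two well-separated speakers.
clusterer = OnlineSpeakerClusterer(threshold=1.0)
utt_a = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1]]
utt_b = [[5.0, 5.1], [5.2, 4.9], [5.1, 5.0]]
utt_a2 = [[0.1, 0.0], [0.0, 0.2]]
labels = [clusterer.assign(u) for u in (utt_a, utt_b, utt_a2)]
# labels == [0, 1, 0]
```

In a full system, the returned cluster index would select which cluster-specific acoustic model receives the MLLR or MAP update for that utterance; that adaptation step is outside the scope of this sketch.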