Using a Kind of Novel Phonotactic Information for SVM Based Speaker Recognition

Xiang ZHANG  Hongbin SUO  Qingwei ZHAO  Yonghong YAN  

IEICE TRANSACTIONS on Information and Systems   Vol.E92-D    No.4    pp.746-749
Publication Date: 2009/04/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E92.D.746
Print ISSN: 0916-8532
Type of Manuscript: LETTER
Category: Speech and Hearing
Keywords: speaker recognition, Gaussian mixture model, universal background model, support vector machine

In this letter, we propose a new approach to SVM-based speaker recognition that uses a novel kind of phonotactic information as the feature for SVM modeling. Gaussian mixture models (GMMs) have proven extremely successful for text-independent speaker recognition. The GMM universal background model (UBM) is a speaker-independent model, each component of which can be regarded as modeling an underlying phonetic sound class. We assume that utterances from different speakers yield different average posterior probabilities on the same Gaussian component of the UBM, and that the supervector composed of an utterance's average posterior probabilities on all UBM components is therefore discriminative. We use these supervectors as the features for SVM-based speaker recognition. Experimental results on a NIST SRE 2006 task show that the proposed approach achieves performance comparable to that of commonly used systems. Fusion results are also presented.
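
The supervector described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation. It assumes a diagonal-covariance UBM, uses a hypothetical two-component, one-dimensional toy model, and omits any feature scaling, kernel choice, or SVM training, none of which the abstract specifies. All function and variable names are made up for illustration.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def frame_posteriors(x, weights, means, variances):
    """Posterior probability of each UBM component given one frame."""
    logs = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def average_posterior_supervector(frames, weights, means, variances):
    """Average the per-frame component posteriors over a whole utterance."""
    if not frames:
        raise ValueError("utterance has no frames")
    sums = [0.0] * len(weights)
    for x in frames:
        for c, p in enumerate(frame_posteriors(x, weights, means, variances)):
            sums[c] += p
    return [s / len(frames) for s in sums]


# Hypothetical toy UBM: two components in one dimension (assumption).
toy_weights = [0.5, 0.5]
toy_means = [[-2.0], [2.0]]
toy_variances = [[1.0], [1.0]]

# Two made-up "utterances" whose frames favor different components.
utterance_a = [[-2.1], [-1.9], [2.0]]
utterance_b = [[2.2], [1.8], [-2.0]]

sv_a = average_posterior_supervector(utterance_a, toy_weights, toy_means, toy_variances)
sv_b = average_posterior_supervector(utterance_b, toy_weights, toy_means, toy_variances)
```

Each supervector has one entry per UBM component and its entries sum to one. In a full system, vectors like `sv_a` and `sv_b` would be the inputs to an SVM trained per target speaker.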