Lexicon-Based Local Representation for Text-Dependent Speaker Verification

Hanxu YOU  Wei LI  Lianqiang LI  Jie ZHU  

IEICE TRANSACTIONS on Information and Systems   Vol.E100-D   No.3   pp.587-589
Publication Date: 2017/03/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2016EDL8182
Type of Manuscript: LETTER
Category: Speech and Hearing
Keywords: i-vector, L-vector, text-dependent speaker verification, cosine distance kernel

A text-dependent i-vector extraction scheme and a lexicon-based binary vector (L-vector) representation are proposed to improve the performance of text-dependent speaker verification. The i-vector and the L-vector together represent each enrollment and test utterance. An improved cosine distance kernel is constructed by combining the i-vector and the L-vector, and is used with a back-end support vector machine (SVM) to distinguish both speaker identity and lexical (text) content. Experiments are conducted on Part 1 and Part 2 of the RSR2015 corpus; the results indicate that an improvement of up to 30% can be obtained over the traditional i-vector baseline.
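
The abstract does not give the form of the improved kernel, so the following is only a minimal sketch of one way an i-vector and a binary L-vector could be combined into a single cosine-style kernel. The weighted-sum form, the names `cosine_kernel` and `combined_kernel`, the weight `alpha`, and the toy vectors are all assumptions made for illustration and are not taken from the letter.

```python
import math


def cosine_kernel(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def combined_kernel(ivec_a, lvec_a, ivec_b, lvec_b, alpha=1.0):
    """Hypothetical combination: speaker term plus weighted lexical term.

    Assumption: a weighted sum of two cosine kernels. A sum of valid
    kernels with a non-negative weight is itself a valid kernel, but the
    letter may define its kernel differently.
    """
    speaker_term = cosine_kernel(ivec_a, ivec_b)
    lexical_term = cosine_kernel(lvec_a, lvec_b)
    return speaker_term + alpha * lexical_term


# Toy data (invented): a 3-dimensional "i-vector" and a 4-entry binary
# "L-vector" marking which lexicon items occur in the utterance.
enroll_ivec = [0.2, -0.5, 0.1]
enroll_lvec = [1, 0, 1, 0]

# Same speaker-like i-vector, once with matching text and once without.
test_same_text = ([0.25, -0.45, 0.05], [1, 0, 1, 0])
test_diff_text = ([0.25, -0.45, 0.05], [0, 1, 0, 1])

score_same = combined_kernel(enroll_ivec, enroll_lvec, *test_same_text)
score_diff = combined_kernel(enroll_ivec, enroll_lvec, *test_diff_text)
```

In this toy case the two test utterances share the same i-vector, so the score difference comes entirely from the lexical term: the matching-text trial scores higher than the mismatched-text trial. A Gram matrix built from such a kernel could then be supplied to an SVM implementation that accepts precomputed kernels.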