Automatic Vocabulary Adaptation Based on Semantic and Acoustic Similarities

Shoko YAMAHATA  Yoshikazu YAMAGUCHI  Atsunori OGAWA  Hirokazu MASATAKI  Osamu YOSHIOKA  Satoshi TAKAHASHI  

IEICE TRANSACTIONS on Information and Systems   Vol.E97-D   No.6   pp.1488-1496
Publication Date: 2014/06/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E97.D.1488
Type of Manuscript: Special Section PAPER (Special Section on Advances in Modeling for Real-world Speech Information Processing and its Application)
Category: Speech Recognition
Keywords: out-of-vocabulary, vocabulary adaptation, semantic similarity, acoustic similarity

Recognition errors caused by out-of-vocabulary (OOV) words cause critical problems when developing spoken language understanding systems based on automatic speech recognition technology. Automatic vocabulary adaptation is an essential technique for solving these problems. In this paper, we propose a novel and effective automatic vocabulary adaptation method. Our method selects OOV words from relevant documents using a combined score of semantic and acoustic similarities. Because this combined score reflects both semantic and acoustic aspects, only the necessary OOV words are selected and redundant words are not registered. In addition, our method estimates the probabilities of OOV words using semantic similarity and a class-based N-gram language model. These probabilities are expected to be appropriate because they take into account both the frequencies of OOV words in the target speech data and the stable class N-gram probabilities. Experimental results show that our method improves OOV selection accuracy and the recognition accuracy of newly registered words in comparison with conventional methods.
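
The selection and probability estimation steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: the linear interpolation of the two similarities, the `weight` and `threshold` parameters, the rule that in-class word probabilities are proportional to semantic similarity, and all names (`Candidate`, `combined_score`, `select_oov`, `in_class_probs`, `class_ngram_prob`). Only the final factorization P(w | h) = P(c | h) P(w | c) is the standard class-based N-gram form.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate OOV word with similarity scores assumed to lie in [0, 1]."""
    word: str
    semantic: float  # similarity to the topic of the target speech (assumed given)
    acoustic: float  # similarity to misrecognized segments (assumed given)


def combined_score(c: Candidate, weight: float = 0.5) -> float:
    # Assumption: a simple linear interpolation of the two similarities.
    return weight * c.semantic + (1.0 - weight) * c.acoustic


def select_oov(candidates, weight: float = 0.5, threshold: float = 0.6):
    # Keep only candidates whose combined score reaches the (hypothetical)
    # threshold, ordered from highest to lowest score.
    scored = sorted(
        ((combined_score(c, weight), c.word) for c in candidates),
        reverse=True,
    )
    return [word for score, word in scored if score >= threshold]


def in_class_probs(semantic_by_word: dict, class_mass: float = 1.0) -> dict:
    # Assumption: P(word | class) is proportional to semantic similarity,
    # scaled so the selected words share `class_mass` in total.
    total = sum(semantic_by_word.values())
    return {w: class_mass * s / total for w, s in semantic_by_word.items()}


def class_ngram_prob(p_class_given_history: float, p_word_given_class: float) -> float:
    # Standard class-based N-gram factorization: P(w | h) = P(c | h) * P(w | c).
    return p_class_given_history * p_word_given_class


candidates = [
    Candidate("tokyo", semantic=0.9, acoustic=0.7),
    Candidate("the", semantic=0.1, acoustic=0.2),
    Candidate("kyoto", semantic=0.5, acoustic=0.8),
]
selected = select_oov(candidates)
semantic = {c.word: c.semantic for c in candidates if c.word in selected}
word_probs = in_class_probs(semantic)
print(selected)
print({w: class_ngram_prob(0.01, p) for w, p in word_probs.items()})
```

In this toy input, a word that scores low on both similarities falls below the threshold and is not registered, which mirrors the stated goal of avoiding redundant vocabulary entries.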