Selection of Optimum Vocabulary and Dialog Strategy for Noise-Robust Spoken Dialog Systems

Akinori ITO  Takanobu OBA  Takashi KONASHI  Motoyuki SUZUKI  Shozo MAKINO  

IEICE TRANSACTIONS on Information and Systems   Vol.E91-D   No.3   pp.538-548
Publication Date: 2008/03/01
Online ISSN: 1745-1361
DOI: 10.1093/ietisy/e91-d.3.538
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Section on Robust Speech Processing in Realistic Environments)
Category: ASR System Architecture
Keyword: spoken dialog system, noisy environment, dialog strategy, neural network, speech recognition

Speech recognition in noisy environments is one of the most active topics in speech recognition research. Noise-tolerant acoustic models and noise reduction techniques are often used to improve recognition accuracy. In this paper, we propose a method that improves the accuracy of a spoken dialog system from the language model point of view. In the proposed method, the dialog system automatically changes its language model and dialog strategy according to the recognition accuracy estimated for the current noisy environment, so that system performance stays high. In a noise-free environment, the system accepts any utterance from the user. In a noisy environment, by contrast, the system restricts its grammar and vocabulary. To realize this strategy, we investigated a method in which the system gives the user an instruction so that the user avoids out-of-grammar utterances. Furthermore, we developed a method to estimate recognition accuracy from features extracted from the noise signal. Finally, we implemented the proposed dialog system based on these investigations.
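
The switching behavior described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `DialogConfig`, `OPEN`, `RESTRICTED`, and `select_config` are hypothetical, the accuracy estimate is assumed to be a value between 0 and 1 supplied by some separate estimator, and the threshold of 0.8 is an arbitrary placeholder rather than a value from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DialogConfig:
    """Hypothetical bundle of language-model and dialog-strategy settings."""
    restrict_grammar: bool  # True: recognizer uses a restricted grammar and vocabulary
    instruct_user: bool     # True: system tells the user which utterances it accepts


# Illustrative configurations (assumed names, not taken from the paper).
OPEN = DialogConfig(restrict_grammar=False, instruct_user=False)
RESTRICTED = DialogConfig(restrict_grammar=True, instruct_user=True)


def select_config(estimated_accuracy: float, threshold: float = 0.8) -> DialogConfig:
    """Choose a configuration from an accuracy estimate in [0, 1].

    The threshold is a placeholder; a real system would tune it.
    """
    if not 0.0 <= estimated_accuracy <= 1.0:
        raise ValueError("estimated_accuracy must be in [0, 1]")
    # High estimated accuracy: accept any utterance.
    # Low estimated accuracy: restrict grammar and instruct the user.
    return OPEN if estimated_accuracy >= threshold else RESTRICTED
```

With these assumptions, `select_config(0.95)` returns the open configuration and `select_config(0.4)` returns the restricted one. A single threshold is the simplest possible rule; a system with several grammar sizes could map accuracy ranges to more than two configurations.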