Speech Recognition Based on Fusion of Visual and Auditory Information Using Full-Frame Color Image

Satoru IGAWA  Akio OGIHARA  Akira SHINTANI  Shinobu TAKAMATSU  

Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences, Vol.E79-A, No.11, pp.1836-1840
Publication Date: 1996/11/25
Print ISSN: 0916-8508
Type of Manuscript: Special Section LETTER (Special Section of Letters Selected from the 1996 IEICE General Conference)
Keyword: speech recognition, fusion of visual and auditory, sensor fusion, hidden Markov model

Summary: 
We propose a method that fuses auditory and visual information for accurate speech recognition. For each word, the method first calculates two probabilities with HMMs, one from the auditory information and one from the visual information, and then fuses them by linear combination. In addition, we use full-frame color images as the visual information in order to improve the accuracy of the proposed speech recognition system. We performed experiments comparing the proposed method with methods that use only auditory information or only visual information, and confirmed the validity of the proposed method.
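
The fusion step described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the weight parameter `audio_weight`, the example vocabulary, and the score values are all hypothetical, and the summary does not state whether the combination is applied to probabilities or to log-probabilities (log-likelihoods are assumed here). The per-word scores are taken as already computed by an auditory HMM and a visual HMM.

```python
def fuse_scores(audio_scores, visual_scores, audio_weight):
    """Linearly combine per-word scores from two recognizers.

    audio_scores, visual_scores: dicts mapping each word to a score
    (assumed here to be HMM log-likelihoods; this is an assumption).
    audio_weight: hypothetical weight in [0, 1] given to the auditory
    score; the visual score receives 1 - audio_weight.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    if audio_scores.keys() != visual_scores.keys():
        raise ValueError("both score tables must cover the same words")
    return {
        word: audio_weight * audio_scores[word]
        + (1.0 - audio_weight) * visual_scores[word]
        for word in audio_scores
    }


def recognize(audio_scores, visual_scores, audio_weight=0.5):
    """Return the word with the highest fused score."""
    fused = fuse_scores(audio_scores, visual_scores, audio_weight)
    return max(fused, key=fused.get)


# Hypothetical scores for a two-word vocabulary (illustrative values only).
audio = {"hello": -120.0, "yellow": -118.0}
visual = {"hello": -40.0, "yellow": -55.0}

audio_only = recognize(audio, visual, audio_weight=1.0)
fused_result = recognize(audio, visual, audio_weight=0.5)
```

With these illustrative values, the auditory scores alone slightly favor "yellow", while the fused scores favor "hello" because the visual scores separate the two words more clearly. Setting `audio_weight` to 1.0 or 0.0 reduces the sketch to the auditory-only or visual-only baselines that the summary mentions as comparison methods.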