Speech Recognition Based on Fusion of Visual and Auditory Information Using Full-Frame Color Image
Satoru IGAWA Akio OGIHARA Akira SHINTANI Shinobu TAKAMATSU
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences
Publication Date: 1996/11/25
Print ISSN: 0916-8508
Type of Manuscript: Special Section LETTER (Special Section of Letters Selected from the 1996 IEICE General Conference)
speech recognition, fusion of visual and auditory, sensor fusion, hidden Markov model
We propose a method that fuses auditory and visual information for accurate speech recognition. For each word, the method first calculates two probabilities with HMMs, one from the auditory information and one from the visual information, and then fuses them by linear combination. In addition, we use full-frame color images as the visual information in order to improve the accuracy of the proposed speech recognition system. We performed experiments comparing the proposed method with methods that use only auditory information or only visual information, and confirmed the validity of the proposed method.
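The fusion step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the combination weights, the score scale, or the vocabulary, so the function names, the single weight `weight`, the use of log-likelihoods as scores, and all numeric values below are assumptions chosen only for illustration.

```python
def fuse_scores(audio_scores, visual_scores, weight=0.7):
    """Linearly combine per-word scores from two HMM banks.

    audio_scores / visual_scores: dict mapping word -> score, assumed here to
    be the log-likelihood given by that word's auditory or visual HMM.
    weight: hypothetical weight on the auditory score; the visual score
    receives (1 - weight).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if audio_scores.keys() != visual_scores.keys():
        raise ValueError("both modalities must score the same vocabulary")
    return {
        word: weight * audio_scores[word] + (1.0 - weight) * visual_scores[word]
        for word in audio_scores
    }


def recognize(audio_scores, visual_scores, weight=0.7):
    """Return the word with the highest fused score."""
    fused = fuse_scores(audio_scores, visual_scores, weight)
    return max(fused, key=fused.get)


# Made-up scores for a three-word vocabulary (illustrative values only).
audio = {"one": -120.0, "two": -118.0, "three": -131.0}
visual = {"one": -40.0, "two": -55.0, "three": -60.0}

audio_only = recognize(audio, visual, weight=1.0)
visual_only = recognize(audio, visual, weight=0.0)
fused_choice = recognize(audio, visual, weight=0.7)
print(audio_only, visual_only, fused_choice)
```

With these made-up numbers the auditory scores alone favor "two", while the visual scores and the fused scores favor "one", which shows how the second modality can change the decision. In practice the weight would have to be tuned on held-out data, and the two score streams may need normalization before they are combined.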