An Efficient Lip-Reading Method Robust to Illumination Variations

Jinyoung KIM  Joohun LEE  Katsuhiko SHIRAI  

IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E85-A    No.9    pp.2164-2168
Publication Date: 2002/09/01
Print ISSN: 0916-8508
Type of Manuscript: LETTER
Category: Speech and Hearing
Keywords: lip reading, PCA, RASTA, image folding

This paper proposes a method for real-time automatic lip-reading based on image transforms under illumination variations. The method is both efficient (smaller feature data size) and robust (better recognition under different lighting conditions). The image-transform-based approach obtains a compressed representation of the pixel values of the speaker's mouth image and is reported to show superior lip-reading performance. However, it inevitably produces large feature vectors of lip information, which require considerable computation time for lip-reading even when principal component analysis (PCA) is applied. To reduce the required dimension of the feature vectors, the proposed method folds the lip image in each frame based on its symmetry. Folding also compensates for unbalanced illumination between the left and right lip areas. In addition, to filter out the inter-frame time-domain spectral distortion of each pixel caused by illumination noise, the method applies high-pass filtering to the variations of pixel values between consecutive frames. In experiments on a database recorded under various lighting conditions, the proposed lip folding and/or inter-frame filtering greatly reduced the number of features required (principal components in this work) and achieved a higher recognition rate than the conventional method.
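
The two preprocessing ideas in the abstract, folding the lip image and high-pass filtering each pixel over time, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `fold_lip_image` and `highpass_frames`, the averaging of mirrored pixels, the first-order filter form, and the `pole` value are all assumptions chosen for illustration; the abstract does not specify how the fold combines the two halves or which filter is used. Frames are plain lists of rows of grayscale values, and the mouth is assumed to be centred in a frame of even width.

```python
def fold_lip_image(frame):
    """Fold a grayscale frame (list of rows) about its vertical midline.

    Each output pixel is the mean of a left-half pixel and its mirror
    pixel in the right half, so the width (and feature count) is halved.
    Averaging is an assumption; the paper may combine the halves differently.
    """
    width = len(frame[0])
    if width % 2 != 0:
        raise ValueError("frame width must be even for this sketch")
    half = width // 2
    return [
        [(row[x] + row[width - 1 - x]) / 2.0 for x in range(half)]
        for row in frame
    ]


def highpass_frames(frames, pole=0.9):
    """High-pass filter every pixel trajectory across consecutive frames.

    Assumed filter: y[t] = x[t] - x[t-1] + pole * y[t-1], with y[0] = 0.
    A constant brightness offset on a pixel contributes nothing to the output.
    """
    height, width = len(frames[0]), len(frames[0][0])
    prev_in = frames[0]
    prev_out = [[0.0] * width for _ in range(height)]
    filtered = [prev_out]
    for frame in frames[1:]:
        out = [
            [frame[y][x] - prev_in[y][x] + pole * prev_out[y][x]
             for x in range(width)]
            for y in range(height)
        ]
        filtered.append(out)
        prev_in, prev_out = frame, out
    return filtered


def flatten(frame):
    """Turn a 2-D frame into a 1-D feature vector (e.g. as input to PCA)."""
    return [pixel for row in frame for pixel in row]
```

In this sketch a 2x4 frame folds to 2x2, so the vector passed on to PCA has half as many entries, and a lighting imbalance between the left and right halves is averaged out. Because the temporal filter only responds to changes between frames, a steady change in overall brightness does not alter its output.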