Detection of Tongue Protrusion Gestures from Video

Luis Ricardo SAPAICO  Hamid LAGA  Masayuki NAKAJIMA  

IEICE TRANSACTIONS on Information and Systems   Vol.E94-D   No.8   pp.1671-1682
Publication Date: 2011/08/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E94.D.1671
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Image Recognition, Computer Vision
Keywords: face gestures, mouth segmentation, perceptual user interface, tongue protrusion, vision-based systems

We propose a system that, using video information, segments the mouth region from a face image and then detects the protrusion of the tongue from inside the oral cavity. Initially, under the assumption that the mouth is closed, we detect both mouth corners. We use a set of specifically oriented Gabor filters to enhance the horizontal features corresponding to the shadow between the upper and lower lips. After applying the Hough line detector, the endpoints of the detected line are regarded as the mouth corners. The detection rate for mouth corner localization is 85.33%. These points are then input to a mouth appearance model that fits a mouth contour to the image. By segmenting its bounding box we obtain a mouth template. Next, considering the symmetric nature of the mouth, we divide the template into right and left halves. Thus, our system makes use of three templates. We track the mouth in the following frames using normalized correlation for mouth template matching. Changes in the mouth region are directly reflected in the correlation value, i.e., the appearance of the tongue on the surface of the mouth causes the correlation coefficient to decrease over time. These coefficients are used for detecting the tongue protrusion. The right and left tongue protrusion positions are detected by analyzing similarity changes between the right and left half-mouth templates and the currently tracked ones. Detection rates under the default parameters of our system are 90.20% for the tongue protrusion regardless of the position, and 84.78% for the right and left tongue protrusion positions. Our results demonstrate the feasibility of real-time tongue protrusion detection in vision-based systems and motivate further investigation of this new modality in human-computer communication.
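
The correlation-based detection step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the mouth patch has already been tracked and cropped to the same size as the template, uses the zero-mean form of normalized correlation (the abstract does not say which variant is used), and relies on a hypothetical threshold of 0.7. The names `normalized_correlation` and `classify_frame` are illustrative, and "left" and "right" refer to image halves, not to the subject's own left and right.

```python
import numpy as np


def normalized_correlation(template, patch):
    """Zero-mean normalized correlation of two equally sized grayscale arrays.

    Returns a value in [-1, 1]; 1 means identical appearance up to
    brightness offset and contrast scaling.
    """
    t = template.astype(np.float64) - template.mean()
    p = patch.astype(np.float64) - patch.mean()
    denom = np.sqrt((t * t).sum() * (p * p).sum())
    if denom == 0.0:
        # A flat region has no structure to correlate against.
        return 0.0
    return float((t * p).sum() / denom)


def classify_frame(template, patch, threshold=0.7):
    """Label a tracked mouth patch as 'none', 'left', or 'right'.

    threshold is a hypothetical value, not a parameter from the paper.
    """
    half = template.shape[1] // 2

    # Three templates: the full mouth plus its left and right image halves.
    full = normalized_correlation(template, patch)
    left = normalized_correlation(template[:, :half], patch[:, :half])
    right = normalized_correlation(template[:, half:], patch[:, half:])

    # High similarity to the closed-mouth template: no protrusion assumed.
    if full >= threshold:
        return "none"

    # Assumption: the half whose similarity dropped more contains the tongue.
    return "left" if left < right else "right"
```

In practice the decision would likely be made on the correlation values over several frames rather than on a single frame, since the abstract describes the coefficient decreasing over time; the single-frame rule here only keeps the sketch short.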