Integrating Facial Expression and Body Gesture in Videos for Emotion Recognition

Jingjie YAN
Wenming ZHENG
Minhai XIN
Jingwei YAN

IEICE TRANSACTIONS on Information and Systems   Vol.E97-D    No.3    pp.610-613
Publication Date: 2014/03/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E97.D.610
Print ISSN: 0916-8532
Type of Manuscript: LETTER
Category: Pattern Recognition
Keywords: bimodal emotion recognition, Harris plus cuboids spatio-temporal feature (HST), sparse canonical correlation analysis (SCCA)


In this letter, we investigate the use of face and gesture image sequences for video-based bimodal emotion recognition, applying both the Harris plus cuboids spatio-temporal feature (HST) and the sparse canonical correlation analysis (SCCA) fusion method to this end. To effectively capture spatio-temporal features, we adopt the Harris 3D feature detector proposed by Laptev and Lindeberg to detect interest points in both face and gesture videos, and then apply the cuboids feature descriptor to extract the facial expression and gesture emotion features [1],[2]. To further extract the common emotion features shared by the facial expression and gesture feature sets, the SCCA method is applied, and the extracted features are used for bimodal emotion classification, where the K-nearest neighbor classifier and the SVM classifier are each used for this purpose. We evaluate this method on the bimodal face and body gesture (FABO) database, and the experimental results demonstrate better recognition accuracy than other methods.
