Combining 3D Convolutional Neural Networks with Transfer Learning by Supervised Pre-Training for Facial Micro-Expression Recognition

Ruicong ZHI  Hairui XU  Ming WAN  Tingting LI  

IEICE TRANSACTIONS on Information and Systems   Vol.E102-D   No.5   pp.1054-1064
Publication Date: 2019/05/01
Publicized: 2019/01/29
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2018EDP7153
Type of Manuscript: PAPER
Category: Pattern Recognition
Keywords: facial micro-expression, 3D convolutional neural networks, transfer learning, spatiotemporal features

Facial micro-expressions are momentary and subtle facial reactions, and recognizing them automatically with high accuracy remains challenging in practical applications. Extracting spatiotemporal features from facial image sequences is essential for facial micro-expression recognition. In this paper, we employed 3D Convolutional Neural Networks (3D-CNNs) to learn features that represent facial micro-expressions effectively, since 3D-CNNs can extract spatiotemporal features directly from facial image sequences. Moreover, transfer learning was used to address the shortage of samples in facial micro-expression databases. We first pre-trained the 3D-CNNs on Oulu-CASIA, a database of normal facial expressions, by supervised learning, and then transferred the pre-trained model to the target domain, the facial micro-expression recognition task. The proposed method was evaluated on two available facial micro-expression datasets, CASME II and SMIC-HS. We obtained overall accuracies of 97.6% on CASME II and 97.4% on SMIC-HS, which were 3.4% and 1.6% higher, respectively, than the 3D-CNNs model without transfer learning. The experimental results demonstrated that our method achieved superior performance compared with state-of-the-art methods.
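To illustrate why a 3D convolution captures spatiotemporal features, the following is a minimal sketch of a single-channel 3D convolution over a frame sequence, written in plain Python. It is not the architecture used in the paper: the function name `conv3d_valid`, the toy clip, and the temporal-difference kernel are all hypothetical and chosen only to show that one kernel spans both time and space.

```python
def conv3d_valid(volume, kernel):
    """Valid (no padding, stride 1) 3D cross-correlation of a T x H x W volume."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # The kernel slides over time (dt) as well as space (dy, dx).
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical toy clip: 4 frames of 3x3 pixels; the centre pixel brightens over time.
clip = [[[0.0] * 3 for _ in range(3)] for _ in range(4)]
for t in range(4):
    clip[t][1][1] = float(t)

# Hypothetical temporal-difference kernel (2 frames x 1 x 1 pixel):
# it responds to frame-to-frame intensity change, i.e. to motion.
kernel = [[[-1.0]], [[1.0]]]

response = conv3d_valid(clip, kernel)
```

In this toy case the response is nonzero only at the pixel that changes between frames, which is the kind of subtle temporal cue a 2D convolution applied to single frames cannot see. A real 3D-CNN would stack many such learned kernels across multiple channels and layers.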