Block-Based Bag of Words for Robust Face Recognition under Variant Conditions of Facial Expression, Illumination, and Partial Occlusion

Zisheng LI  Jun-ichi IMAI  Masahide KANEKO  

Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E94-A   No.2   pp.533-541
Publication Date: 2011/02/01
Online ISSN: 1745-1337
DOI: 10.1587/transfun.E94.A.533
Print ISSN: 0916-8508
Type of Manuscript: Special Section PAPER (Special Section on Image Media Quality)
Category: Processing
Keyword: 
face recognition, block-based bag of words, expressions, illuminations, occlusion-invariant, single training sample per person





Summary: 
In many real-world face recognition applications, only one training image per person may be available. Moreover, the test images may vary in facial expression and illumination, or may be partially occluded. However, most classical face recognition techniques assume that multiple images per person are available for training, and they have difficulty handling extreme expressions, illumination changes, and occlusions. This paper proposes a novel block-based bag of words (BBoW) method to solve these problems. In our approach, a face image is partitioned into multiple blocks; dense SIFT features are then calculated and vector-quantized into different visual words within each block. Finally, the histograms of codeword distribution over the local blocks are concatenated to represent the face image. Our method captures local features in each block while maintaining holistic spatial information about the different facial components. Without any illumination compensation or image alignment processing, the proposed method achieves excellent face recognition results on the AR and XM2VTS databases. Experimental results show that, using only one neutral-expression frame per person for training, our method obtains the best reported performance on AR database face images with extreme expressions, varying illuminations, and partial occlusions. We also test our method on the standard and darkened sets of the XM2VTS database, achieving average recognition rates of 100% and 96.10%, respectively.
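The pipeline described above (partition into blocks, dense local descriptors, vector quantization against a codebook, concatenated per-block histograms) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the simple gradient-orientation histogram stands in for dense SIFT, and the grid size, patch size, and codebook are arbitrary placeholder choices.

```python
import numpy as np

def dense_descriptors(block, patch=8, step=4):
    # Stand-in for dense SIFT: 8-bin gradient-orientation histograms
    # over densely sampled patches (the paper uses true SIFT descriptors).
    gy, gx = np.gradient(block.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)
    h, w = block.shape
    descs = []
    for y in range(0, h - patch + 1, step):
        for x in range(0, w - patch + 1, step):
            a = ang[y:y + patch, x:x + patch].ravel()
            m = mag[y:y + patch, x:x + patch].ravel()
            hist, _ = np.histogram(a, bins=8, range=(0, 2 * np.pi), weights=m)
            n = np.linalg.norm(hist)
            descs.append(hist / n if n > 0 else hist)
    return np.array(descs)

def bbow_histogram(image, codebook, grid=(4, 4)):
    # Block-based bag of words: quantize each block's descriptors to their
    # nearest codewords, histogram them per block, and concatenate.
    h, w = image.shape
    bh, bw = h // grid[0], w // grid[1]
    feats = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            block = image[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            descs = dense_descriptors(block)
            d2 = ((descs[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
            words = d2.argmin(axis=1)
            hist = np.bincount(words, minlength=len(codebook)).astype(float)
            hist /= max(hist.sum(), 1.0)  # normalized per-block histogram
            feats.append(hist)
    return np.concatenate(feats)

# Toy usage: a random 64x64 "face" and a random 32-word codebook of
# 8-D descriptors (in practice the codebook comes from k-means on training data).
rng = np.random.default_rng(0)
face = rng.random((64, 64))
codebook = rng.random((32, 8))
feat = bbow_histogram(face, codebook)
print(feat.shape)  # 4*4 blocks x 32 words -> (512,)
```

Because each block keeps its own histogram, an occluded region corrupts only the corresponding segments of the concatenated feature, which is what gives the representation its robustness to partial occlusion.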