Cross-Pose Face Recognition – A Virtual View Generation Approach Using Clustering Based LVTM

Xi LI  Tomokazu TAKAHASHI  Daisuke DEGUCHI  Ichiro IDE  Hiroshi MURASE  

IEICE TRANSACTIONS on Information and Systems   Vol.E96-D   No.3   pp.531-537
Publication Date: 2013/03/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E96.D.531
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Section on Face Perception and Recognition)
Category: Face Perception and Recognition
Keywords: face recognition, pose invariant, clustering, local view transition model


This paper presents an approach to cross-pose face recognition based on virtual view generation using an appearance-clustering-based local view transition model. Previously, the traditional global-pattern-based view transition model (VTM) was extended to a local version, the LVTM, which learns a linear transformation of pixel values between frontal and non-frontal image pairs from training images. Instead of transforming the entire image pattern, the LVTM learns this transformation at each location from the partial image in a small surrounding region. In this paper, we show that the accuracy of the appearance transition model and the recognition rate can be further improved by better exploiting the inherent linear relationship between frontal and non-frontal face image patch pairs. The improvement rests on the observation that pose-induced variations in appearance are closely related to the underlying 3D structure; intuitively, frontal and non-frontal patch pairs from more similar local 3D face structures should have a stronger linear relationship. Thus, for each location, instead of learning a single common transformation as in the LVTM, the corresponding local patches are first clustered using an appearance similarity distance metric, and a transition model is then learned separately for each cluster. In the testing stage, each local patch of the input non-frontal probe image is transformed using the learned local view transition model of the most visually similar cluster. Experimental results on a real-world face dataset demonstrate the superiority of the proposed method in terms of recognition rate.
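
The per-location procedure summarized above (cluster the training patches, learn one linear transition per cluster, transform a probe patch with the model of its nearest cluster) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: plain k-means with Euclidean distance stands in for the appearance similarity metric, which the abstract does not specify; clustering is done on the non-frontal patches because only those are available at test time; the transition is fitted by ridge-regularized least squares; and all function names and parameters (`kmeans`, `fit_clustered_lvtm`, `transform_patch`, `ridge`) are hypothetical.

```python
import numpy as np


def kmeans(x, k, iters=20):
    """Plain k-means with deterministic farthest-point initialization (assumed)."""
    centroids = [x[0]]
    for _ in range(1, k):
        # Squared distance from every sample to its nearest chosen centroid.
        d = np.min([((x - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(x[d.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(iters):
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = x[labels == j].mean(axis=0)
    # Recompute the labels so they match the final centroids.
    dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dist.argmin(axis=1)


def fit_clustered_lvtm(nonfrontal, frontal, k=2, ridge=1e-3):
    """Learn one linear transition per cluster for a single patch location.

    nonfrontal, frontal: arrays of shape (n_samples, patch_dim), row-aligned
    so that row i of each array comes from the same training face.
    """
    centroids, labels = kmeans(nonfrontal, k)
    dim = nonfrontal.shape[1]
    transforms = []
    for j in range(k):
        a = nonfrontal[labels == j]
        b = frontal[labels == j]
        # Ridge least squares: T = (A^T A + ridge * I)^-1 A^T B, so a @ T ~ b.
        t = np.linalg.solve(a.T @ a + ridge * np.eye(dim), a.T @ b)
        transforms.append(t)
    return centroids, transforms


def transform_patch(patch, centroids, transforms):
    """Map a non-frontal probe patch to a virtual frontal patch."""
    j = ((centroids - patch) ** 2).sum(axis=1).argmin()  # nearest cluster
    return patch @ transforms[j]
```

In a full pipeline this would be repeated for every patch location, with the transformed patches assembled into a virtual frontal view for matching against the gallery. Setting `k=1` reduces the sketch to a single common transformation per location, which corresponds to the LVTM baseline the abstract compares against.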