Incremental Unsupervised-Learning of Appearance Manifold with View-Dependent Covariance Matrix for Face Recognition from Video Sequences

Lina  Tomokazu TAKAHASHI  Ichiro IDE  Hiroshi MURASE  

IEICE TRANSACTIONS on Information and Systems   Vol.E92-D    No.4    pp.642-652
Publication Date: 2009/04/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E92.D.642
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Pattern Recognition
Keywords: appearance manifold, view-dependent covariance matrix, incremental learning, video-based face recognition, eigenspace

We propose an appearance manifold with view-dependent covariance matrix for face recognition from video sequences, under two learning frameworks: supervised learning and incremental unsupervised learning. The method has two advantages. First, the appearance manifold with view-dependent covariance matrix model is robust to pose changes and invariant to noise, because the embedded covariance matrices are calculated per pose and therefore capture the sample distributions along the manifold. Second, the proposed incremental unsupervised-learning framework is more realistic for real-world face recognition applications. It is difficult to collect large numbers of face sequences covering the complete range of poses (from left side view to right side view) for training; the incremental unsupervised-learning framework allows the system to be trained with the initially available sequences and then to update its knowledge incrementally each time an unlabelled sequence is input. In addition, we integrate the model with a pose estimation system, which improves classification accuracy and makes it easy to detect sequences with overlapping poses for the merging process in the incremental unsupervised-learning framework. Experimental results show that, in both frameworks, the proposed method recognizes faces from video sequences accurately.
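
The idea of a covariance matrix that depends on the view can be illustrated with a minimal Python sketch. It is not the authors' implementation: the summary does not give the formulation, so the sketch assumes that pose angles in degrees are already available from a pose estimator, that the pose range is split into a fixed number of bins, and that a sequence is scored by the mean squared Mahalanobis distance to the per-bin models. The names `fit_view_models`, `sequence_distance`, `classify`, `n_bins`, and `reg` are hypothetical.

```python
import numpy as np


def fit_view_models(features, poses, n_bins=5, reg=1e-3):
    """Fit one (mean, inverse covariance) pair per pose bin for one person.

    features: (n_samples, dim) array of feature vectors (e.g. eigenspace
              projections); poses: (n_samples,) angles in degrees, -90 to 90.
    The binning scheme and the regulariser `reg` are assumptions of this sketch.
    """
    edges = np.linspace(-90.0, 90.0, n_bins + 1)
    bins = np.clip(np.digitize(poses, edges) - 1, 0, n_bins - 1)
    dim = features.shape[1]
    models = {}
    for b in range(n_bins):
        x = features[bins == b]
        if len(x) < 2:
            continue  # this pose range has not been observed yet
        cov = np.cov(x, rowvar=False) + reg * np.eye(dim)
        models[b] = (x.mean(axis=0), np.linalg.inv(cov))
    return models, edges


def sequence_distance(models, edges, features, poses):
    """Mean squared Mahalanobis distance of a sequence to one person's models."""
    n_bins = len(edges) - 1
    bins = np.clip(np.digitize(poses, edges) - 1, 0, n_bins - 1)
    dists = []
    for x, b in zip(features, bins):
        key = int(b)
        if key not in models:
            continue  # no model for this pose; skip the frame
        mean, inv_cov = models[key]
        d = x - mean
        dists.append(float(d @ inv_cov @ d))
    return float(np.mean(dists)) if dists else float("inf")


def classify(gallery, edges, features, poses):
    """Return the gallery label whose pose-dependent models fit the sequence best."""
    return min(
        gallery,
        key=lambda name: sequence_distance(gallery[name], edges, features, poses),
    )
```

Here `gallery` is a dictionary mapping each person's label to the `models` returned by `fit_view_models`. In an incremental setting, one plausible use (again an assumption) would be to refit the models of the best-matching person after appending the frames of a newly classified unlabelled sequence, so that previously unseen pose bins get filled in over time.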