Modeling Interactions between Low-Level and High-Level Features for Human Action Recognition

Wen ZHOU  Chunheng WANG  Baihua XIAO  Zhong ZHANG  Yunxue SHAO  

IEICE TRANSACTIONS on Information and Systems   Vol.E96-D   No.12   pp.2896-2899
Publication Date: 2013/12/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E96.D.2896
Print ISSN: 0916-8532
Type of Manuscript: LETTER
Category: Image Recognition, Computer Vision
Keywords: action recognition, camera position features, local spatio-temporal features, interactions

Recognizing human actions in complex scenes is a challenging problem in computer vision. Action-unrelated concepts, such as the camera position, can significantly affect the appearance of local spatio-temporal features, and the performance of methods based on low-level features degrades as a result. In this letter, we define one such action-unrelated concept, the position of the camera, as a high-level feature. We observe that it can serve as a prior on local spatio-temporal features for human action recognition. We encode this prior by modeling the interactions between spatio-temporal features and camera position features, and we infer the camera position features from the local spatio-temporal features via these interactions. The parameters of the model are estimated by a new max-margin algorithm. We evaluate the proposed method on the KTH, IXMAS and YouTube action datasets. Experimental results show the effectiveness of the proposed method.
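
To make the idea of inferring camera position jointly with the action more concrete, the following is a minimal sketch, not the model from the letter. It assumes a linear joint score over a video-level feature vector x, an action label y and a latent camera position h, with one hypothetical pairwise term standing in for the "interactions". The names score, predict, W_action, W_camera and W_pair are illustrative assumptions, and the max-margin training step is not shown.

```python
import numpy as np

def score(x, y, h, W_action, W_camera, W_pair):
    """Joint score of action y and camera position h for feature vector x.

    Assumed form (illustrative only): an action term, a camera term,
    and a pairwise action/camera compatibility term.
    """
    return W_action[y] @ x + W_camera[h] @ x + W_pair[y, h]

def predict(x, W_action, W_camera, W_pair):
    """Return the (action, camera) pair with the highest joint score."""
    n_actions, n_cameras = W_pair.shape
    scores = np.array([[score(x, y, h, W_action, W_camera, W_pair)
                        for h in range(n_cameras)]
                       for y in range(n_actions)])
    # The camera position is treated as latent: it is chosen together
    # with the action by maximizing over all (y, h) combinations.
    y_best, h_best = np.unravel_index(np.argmax(scores), scores.shape)
    return int(y_best), int(h_best), scores

# Toy example with 2 actions, 3 camera positions, 2-dimensional features.
x = np.array([1.0, 0.0])
W_action = np.array([[1.0, 0.0], [0.0, 1.0]])
W_camera = np.zeros((3, 2))
W_pair = np.zeros((2, 3))
W_pair[1, 2] = 5.0  # strong compatibility between action 1 and camera 2

y_hat, h_hat, all_scores = predict(x, W_action, W_camera, W_pair)
print(y_hat, h_hat)
```

In this toy setting the action term alone would favor action 0, but the pairwise term tips the decision toward action 1 seen from camera position 2, which is the kind of effect a camera-position prior is meant to capture.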