Discriminating Semantic Visual Words for Scene Classification

Shuoyan LIU, De XU, Songhe FENG

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E93-D   No.6   pp.1580-1588
Publication Date: 2010/06/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E93.D.1580
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Pattern Recognition
Keyword:
scene classification, information bottleneck, Gaussian mixture modeling, semantic visual words, semantic interpretation

Summary: 
Bag-of-Visual-Words representation has recently become popular for scene classification. However, visual words learned in an unsupervised manner become unreliable when patches with similar appearance correspond to distinct semantic concepts. This paper proposes a novel supervised learning framework that takes full advantage of label information to address this problem. Specifically, Gaussian Mixture Modeling (GMM) is first applied to obtain a "semantic interpretation" of patches using scene labels. Each scene induces a probability density on the low-level visual feature space, and patches are represented as vectors of posterior probabilities over the scene semantic concepts. The Information Bottleneck (IB) algorithm is then introduced to cluster the patches into "visual words" in a supervised manner, from the perspective of these semantic interpretations. Such an operation can maximize the semantic information carried by the visual words. Once the visual words are obtained, the frequency with which each visual word appears in a given image forms a histogram, which is subsequently used for scene categorization with a Support Vector Machine (SVM) classifier. Experiments on a challenging dataset show that the proposed visual words perform the scene classification task better than most existing methods.
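The pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation, and several points are assumptions made only for illustration: each scene is modeled by a single diagonal Gaussian rather than a full mixture, the IB step uses the greedy agglomerative variant (the summary does not say which variant the paper uses), and all function names (`scene_posteriors`, `agglomerative_ib`, `word_histogram`) are hypothetical.

```python
import numpy as np

def scene_posteriors(patches, means, variances, priors):
    """P(scene | patch) for each patch.

    Assumption: one diagonal Gaussian per scene stands in for the per-scene GMM.
    patches: (n, d); means, variances: (k, d); priors: (k,).
    """
    diff = patches[:, None, :] - means[None, :, :]            # (n, k, d)
    log_lik = -0.5 * np.sum(diff ** 2 / variances
                            + np.log(2 * np.pi * variances), axis=2)
    log_post = log_lik + np.log(priors)
    log_post -= log_post.max(axis=1, keepdims=True)           # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def _kl(p, q):
    """KL divergence KL(p || q), skipping zero entries of p."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def _merge_cost(w_i, p_i, w_j, p_j):
    """Information about the scene label lost by merging two clusters
    (total weight times the weighted Jensen-Shannon divergence)."""
    mix = (w_i * p_i + w_j * p_j) / (w_i + w_j)
    return w_i * _kl(p_i, mix) + w_j * _kl(p_j, mix)

def agglomerative_ib(posteriors, n_words):
    """Greedily merge patches into n_words clusters, always choosing the
    merge that loses the least information about the scene label."""
    n = len(posteriors)
    # Each cluster: (member indices, weight, distribution over scenes).
    clusters = [([i], 1.0 / n, posteriors[i]) for i in range(n)]
    while len(clusters) > n_words:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                cost = _merge_cost(clusters[a][1], clusters[a][2],
                                   clusters[b][1], clusters[b][2])
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        ids_a, w_a, p_a = clusters[a]
        ids_b, w_b, p_b = clusters[b]
        merged = (ids_a + ids_b, w_a + w_b,
                  (w_a * p_a + w_b * p_b) / (w_a + w_b))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    labels = np.empty(n, dtype=int)
    for word, (ids, _, _) in enumerate(clusters):
        labels[ids] = word
    return labels

def word_histogram(word_ids, n_words):
    """Normalized frequency of each visual word in one image."""
    counts = np.bincount(word_ids, minlength=n_words).astype(float)
    return counts / counts.sum()
```

In this sketch, the histogram returned by `word_histogram` for each image would serve as the feature vector passed to an SVM classifier. The pairwise search in `agglomerative_ib` is cubic in the number of patches, so it is suitable only for small illustrative inputs.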