Combining LBP and SIFT in Sparse Coding for Categorizing Scene Images

Shuang BAI  Jianjun HOU  Noboru OHNISHI  

IEICE TRANSACTIONS on Information and Systems   Vol.E97-D   No.9   pp.2563-2566
Publication Date: 2014/09/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2014EDL8046
Type of Manuscript: LETTER
Category: Image Recognition, Computer Vision
Keyword: scene categorization, local binary pattern, scale invariant feature transform, sparse coding, feature combination


Local Binary Pattern (LBP) and Scale Invariant Feature Transform (SIFT) are local descriptors widely used in computer vision applications, and they emphasize different aspects of image content. In this letter, we propose combining them in sparse coding to categorize scene images. First, we extract LBP and SIFT features from training images at regularly spaced positions. Then, a visual word codebook is constructed for each feature type. The resulting LBP and SIFT codebooks are used to create a two-dimensional table in which each entry corresponds to a pair of one LBP visual word and one SIFT visual word. Given an input image, the LBP and SIFT features extracted from the same positions are encoded jointly by sparse coding. Spatial max pooling is then applied to obtain the image representation. The resulting representations are converted into one-dimensional feature vectors and classified with SVM classifiers. Finally, we conduct extensive experiments on the Scene Categories 8 and MIT 67 Indoor Scene datasets to evaluate the proposed method. The results demonstrate that combining features in the proposed manner is effective for scene categorization.
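
The encoding and pooling steps described above can be illustrated with a rough sketch. The abstract does not give the exact joint formulation, so everything below is an assumption made for illustration: each descriptor is sparse-coded against its own codebook with a plain ISTA solver, the two-dimensional table entry for a pair of visual words is taken as the absolute product of the two coefficients, and pooling uses a single `grid` x `grid` layout. The function names, the `lam` parameter, and the grid size are hypothetical and do not come from the letter.

```python
import numpy as np


def ista_sparse_code(x, D, lam=0.1, n_iter=100):
    """Approximately solve min_a 0.5*||x - D a||^2 + lam*||a||_1 with ISTA.

    x: descriptor of shape (dim,); D: codebook of shape (dim, n_words).
    """
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - D.T @ (D @ a - x) / L                           # gradient step
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold
    return a


def joint_code(lbp, sift, D_lbp, D_sift, lam=0.1):
    """Fill a (K_lbp, K_sift) table for one position.

    Assumption: entry (i, j) is |a_i * b_j|, the product of the LBP and SIFT
    sparse coefficients. The letter's actual joint encoding may differ.
    """
    a = ista_sparse_code(lbp, D_lbp, lam)
    b = ista_sparse_code(sift, D_sift, lam)
    return np.abs(np.outer(a, b))


def spatial_max_pool(tables, positions, image_size, grid=2):
    """Max-pool per-position tables over a grid x grid layout and flatten.

    tables: (n, K_lbp, K_sift); positions: (n, 2) as (x, y) pixel coordinates;
    image_size: (width, height).
    """
    n, k1, k2 = tables.shape
    width, height = image_size
    pooled = np.zeros((grid, grid, k1, k2))
    for table, (x, y) in zip(tables, positions):
        gx = min(int(x * grid / width), grid - 1)
        gy = min(int(y * grid / height), grid - 1)
        pooled[gy, gx] = np.maximum(pooled[gy, gx], table)
    return pooled.reshape(-1)  # one-dimensional image feature
```

Under these assumptions, the flattened vector returned by `spatial_max_pool` would play the role of the one-dimensional feature passed to an SVM classifier.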