Food Image Recognition Using Covariance of Convolutional Layer Feature Maps

Atsushi TATSUMA  Masaki AONO  

IEICE TRANSACTIONS on Information and Systems   Vol.E99-D   No.6   pp.1711-1715
Publication Date: 2016/06/01
Publicized: 2016/02/23
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2015EDL8212
Type of Manuscript: LETTER
Category: Image Recognition, Computer Vision
Keyword: food image recognition, convolutional neural networks, covariance descriptor, pattern recognition, deep learning

Recent studies have obtained superior performance in image recognition tasks by using, as an image representation, the fully connected layer activations of Convolutional Neural Networks (CNNs) trained on a wide variety of images. However, this CNN representation is not well suited to fine-grained image recognition tasks such as food image recognition. To improve the performance of the CNN representation in food image recognition, we propose a novel image representation composed of the covariances of convolutional layer feature maps. In an experiment on the ETHZ Food-101 dataset, our method achieved an average accuracy of 58.65%, outperforming previous methods such as the Bag-of-Visual-Words Histogram, the Improved Fisher Vector, and CNN-SVM.
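
The idea of a covariance representation over convolutional feature maps can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state which layer is used or whether any normalization follows, so the function name `covariance_descriptor`, the (C, H, W) input layout, the unbiased estimator, and the upper-triangle vectorization are all assumptions made here for illustration. Random numbers stand in for real CNN activations.

```python
import numpy as np


def covariance_descriptor(feature_maps):
    """Covariance of C feature maps, flattened to a vector (illustrative only).

    feature_maps: array of shape (C, H, W), assumed to be the activations
    of one convolutional layer for a single image.
    """
    c, h, w = feature_maps.shape
    # Treat each of the H*W spatial positions as one C-dimensional sample.
    x = feature_maps.reshape(c, h * w)
    # Center each channel over the spatial positions.
    x = x - x.mean(axis=1, keepdims=True)
    # Unbiased C x C covariance matrix between channels.
    cov = x @ x.T / (h * w - 1)
    # The matrix is symmetric, so keep only the upper triangle
    # (including the diagonal): C * (C + 1) / 2 values.
    return cov[np.triu_indices(c)]


# Hypothetical stand-in for real activations: 8 channels on a 6x6 grid.
rng = np.random.default_rng(0)
maps = rng.standard_normal((8, 6, 6))
desc = covariance_descriptor(maps)
print(desc.shape)  # (36,)
```

The resulting vector has a fixed length that depends only on the number of channels, not on the spatial size of the feature maps, so it could in principle be passed to a standard classifier such as a linear SVM.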