A Two-Stage Approach for Fine-Grained Visual Recognition via Confidence Ranking and Fusion

Kangbo SUN
Jie ZHU

Publication
IEICE TRANSACTIONS on Information and Systems, Vol.E103-D, No.12, pp.2693-2700
Publication Date: 2020/12/01
Publicized: 2020/09/11
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2020EDP7024
Type of Manuscript: PAPER
Category: Image Recognition, Computer Vision
Keyword: fine-grained, object location, attention, bilinear pooling, deep learning




Summary: 
Locating an object's parts and representing their features both play key roles in fine-grained visual recognition. To improve recognition accuracy without ground-truth bounding boxes or part annotations, many studies use object location networks, trained with category labels only, to propose bounding boxes or part regions, and then crop the images into partial images that help the classification network make the final decision. In this work, to propose more informative partial images and to extract discriminative features effectively from both the original and the partial images, we propose a two-stage approach that fuses the original features with the partial features by evaluating and ranking the information in the partial images. Experimental results on two benchmark datasets show that the proposed approach performs well, which demonstrates its effectiveness.
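The ranking-and-fusion idea in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not say how the information in a partial image is scored or how features are fused. The sketch assumes the score is the highest softmax probability of a partial image's class logits and that fusion is plain concatenation. The names `confidence`, `rank_and_fuse`, and the parameter `k` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence(logits):
    """Assumed information score: the highest softmax probability."""
    return max(softmax(logits))


def rank_and_fuse(original_feat, part_feats, part_logits, k=2):
    """Rank partial images by confidence, keep the top k, and fuse.

    Fusion by concatenation is an assumption made for illustration;
    the source only states that original and partial features are fused.
    """
    scores = [confidence(l) for l in part_logits]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = order[:k]
    fused = list(original_feat)
    for i in top:
        fused.extend(part_feats[i])
    return top, fused


# Toy data: one original feature vector and three partial images.
original = [0.1, 0.2]
parts = [[1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]
logits = [[0.0, 0.1, 0.0], [4.0, 0.0, 0.0], [1.5, 0.0, 0.0]]

top, fused = rank_and_fuse(original, parts, logits, k=2)
print(top)    # indices of the two most confident partial images
print(fused)  # original feature followed by the selected partial features
```

In this toy input the first partial image has nearly uniform class probabilities, so it ranks last and is dropped. In a two-stage pipeline, the fused vector would then be passed to the final classifier.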

