Training of CNN with Heterogeneous Learning for Multiple Pedestrian Attributes Recognition Using Rarity Rate

Hiroshi FUKUI  Takayoshi YAMASHITA  Yuji YAMAUCHI  Hironobu FUJIYOSHI  Hiroshi MURASE  

IEICE TRANSACTIONS on Information and Systems   Vol.E101-D   No.5   pp.1222-1231
Publication Date: 2018/05/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2017MVP0001
Type of Manuscript: Special Section PAPER (Special Section on Machine Vision and its Applications)
Category: Machine Vision and its Applications
Keywords: pedestrian attributes recognition, heterogeneous learning, rarity rate

Pedestrian attribute recognition is an important function of an advanced driver assistance system (ADAS). Pedestrian attributes such as body pose, face orientation, and an open umbrella indicate the intended action or state of the pedestrian. Generally, this information is recognized using an independent classifier for each task. Running all of these separate classifiers is too time-consuming at the testing stage, and the processing time grows as the number of tasks increases. To address this problem, multi-task learning or heterogeneous learning is used to train a single classifier to perform multiple tasks. In particular, heterogeneous learning can train a single classifier to perform regression and recognition tasks simultaneously, which reduces both training and testing time. However, heterogeneous learning tends to yield a lower accuracy rate for classes with few training samples. In this paper, we propose a method to improve the performance of heterogeneous learning for such classes. We introduce a rarity rate based on the importance and class probability of each task, and assign the appropriate rarity rate to each training sample. The samples in a mini-batch for training a deep convolutional neural network are then augmented according to this rarity rate so that training focuses on the classes with few samples. Our heterogeneous learning approach with the rarity rate achieves better pedestrian attribute recognition, especially for classes represented by few training samples.
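The abstract does not give the formula for the rarity rate, so the following is only a minimal sketch of the general idea, not the method of the paper. It assumes, purely for illustration, that a sample's rarity is the importance-weighted sum of the negative log class probability over the recognition tasks, and it substitutes rarity-proportional oversampling for the mini-batch augmentation described above. The names `class_probabilities`, `rarity_rates`, and `sample_minibatch`, the task names, and the importance weights are all hypothetical.

```python
import math
import random
from collections import Counter


def class_probabilities(labels):
    """Empirical probability of each class within one task's label list."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def rarity_rates(task_labels, task_importance):
    """Per-sample rarity under an ASSUMED formula (not taken from the paper):
    sum over tasks of importance * -log(class probability)."""
    probs = {t: class_probabilities(lbls) for t, lbls in task_labels.items()}
    n_samples = len(next(iter(task_labels.values())))
    rates = []
    for i in range(n_samples):
        rate = sum(
            task_importance[t] * -math.log(probs[t][task_labels[t][i]])
            for t in task_labels
        )
        rates.append(rate)
    return rates


def sample_minibatch(rates, batch_size, rng):
    """Draw indices with probability proportional to rarity (oversampling,
    used here as a stand-in for rarity-driven augmentation)."""
    eps = 1e-6  # keeps every weight strictly positive
    weights = [r + eps for r in rates]
    return rng.choices(range(len(rates)), weights=weights, k=batch_size)


# Toy data: ten pedestrians, two recognition tasks (hypothetical labels).
task_labels = {
    "umbrella": ["closed"] * 9 + ["open"],        # "open" is rare
    "orientation": ["front"] * 5 + ["back"] * 5,  # balanced
}
task_importance = {"umbrella": 1.0, "orientation": 1.0}  # assumed weights

rates = rarity_rates(task_labels, task_importance)
batch = sample_minibatch(rates, batch_size=8, rng=random.Random(0))
```

In this toy data the single open-umbrella sample receives the largest rarity rate, so it is drawn into mini-batches more often than any individual closed-umbrella sample. Regression tasks have no class probability and are left out of the sketch.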