Augmenting Training Samples with a Large Number of Rough Segmentation Datasets

Mitsuru AMBAI  Yuichi YOSHIDA  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E94-D   No.10   pp.1880-1888
Publication Date: 2011/10/01
Online ISSN: 1745-1361
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Section on Information-Based Induction Sciences and Machine Learning)
Category: 
Keyword: 
Interactive recognition,  generic object recognition,  

Full Text: PDF(1.6MB)
>>Buy this Article


Summary: 
We revisit the problem with generic object recognition from the point of view of human-computer interaction. While many existing algorithms for generic object recognition first try to detect target objects before features are extracted and classified in processing, our work is motivated by the belief that solving the task of detection by computer is not always necessary in many practical situations, such as those involving mobile recognition systems with touch displays and cameras. It is natural for these systems to ask users to input the segmentation data for targets through their touch displays. Speaking from the perspective of usability, such systems should involve rough segmentation to reduce the user workload. In this situation, different people would provide different segmentation data. Here, an interesting question arises – if multiple training samples are generated from a single image by using various segmentation data created by different people, what would happen to the accuracy of classification? We created “20 wild bird datasets” that had a large number of rough segmentation datasets made by 383 people in an attempt to answer this question. Our experiments revealed two interesting facts: (i) generating multiple training samples from a single image had positive effects on classification accuracies, especially when image features including spatial information were used and (ii) augmenting training samples with artificial segmentation data synthesized with a morphing technique also had slightly positive effects on classification accuracies.