Early Stopping Heuristics in Pool-Based Incremental Active Learning for Least-Squares Probabilistic Classifier

Tsubasa KOBAYASHI
Masashi SUGIYAMA

Publication
IEICE TRANSACTIONS on Information and Systems, Vol.E95-D, No.8, pp.2065-2073
Publication Date: 2012/08/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E95.D.2065
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Artificial Intelligence, Data Mining
Keyword: pool-based incremental active learning, early stopping, least-squares probabilistic classifier, semi-supervised learning

Summary: 
The objective of pool-based incremental active learning is to choose, one at a time, which sample to label from a pool of unlabeled samples so that the generalization error is minimized. In this scenario, the generalization error often reaches a minimum partway through the incremental active learning procedure and then starts to increase. In this paper, we address the problem of stopping labeling early in probabilistic classification so as to minimize both the generalization error and the labeling cost. Among several possible strategies, we propose to stop labeling when the empirical class-posterior approximation error is maximized. Experiments on benchmark datasets demonstrate the usefulness of the proposed strategy.
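
The stopping idea described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not define how the empirical class-posterior approximation error is computed, so the sketch takes it as a caller-supplied function. The names peak_stop_index, run_active_learning, query_and_label, criterion, and the patience parameter are all hypothetical. Because a maximum can only be recognized in hindsight, the sketch assumes the peak is confirmed once the monitored value has failed to exceed it for patience consecutive rounds.

```python
from typing import Callable, List, Optional, Sequence


def peak_stop_index(scores: Sequence[float], patience: int = 3) -> Optional[int]:
    """Return the index of the running maximum once it is 'confirmed'.

    The peak counts as confirmed when `patience` consecutive later values
    fail to exceed it (an assumption of this sketch, not from the paper).
    Returns None if no peak has been confirmed yet.
    """
    if patience < 1:
        raise ValueError("patience must be at least 1")
    best: Optional[int] = None
    for i, score in enumerate(scores):
        if best is None or score > scores[best]:
            best = i  # new running maximum
        elif i - best >= patience:
            return best  # no improvement for `patience` rounds
    return None


def run_active_learning(
    pool_size: int,
    query_and_label: Callable[[int], None],
    criterion: Callable[[], float],
    patience: int = 3,
) -> int:
    """Label one pool sample per round and stop once the criterion has peaked.

    `query_and_label(step)` is a hypothetical hook that picks one unlabeled
    sample, obtains its label, and retrains the classifier.
    `criterion()` is a hypothetical hook that returns the monitored quantity
    (for example, an empirical class-posterior approximation error).
    Returns the number of labels at the detected peak, or the number of
    labels spent if the pool runs out before a peak is confirmed.
    """
    history: List[float] = []
    for step in range(pool_size):
        query_and_label(step)
        history.append(criterion())
        peak = peak_stop_index(history, patience)
        if peak is not None:
            return peak + 1  # labels used when the criterion peaked
    return len(history)


if __name__ == "__main__":
    # Synthetic criterion values, invented purely for illustration.
    values = iter([0.10, 0.30, 0.50, 0.40, 0.45, 0.20, 0.10])
    n_labels = run_active_learning(
        pool_size=7,
        query_and_label=lambda step: None,
        criterion=lambda: next(values),
        patience=3,
    )
    print(n_labels)
```

Note that in this sketch the patience window costs extra labels beyond the peak; a smaller patience spends fewer labels but is more easily fooled by noise in the monitored value.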