Multiscale Bagging and Its Applications

Hidetoshi SHIMODAIRA, Takafumi KANAMORI, Masayoshi AOKI, Kouta MINE

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E94-D   No.10   pp.1924-1932
Publication Date: 2011/10/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E94.D.1924
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Section on Information-Based Induction Sciences and Machine Learning)
Keyword: 
bagging, active learning, confidence level, classification

Summary: 
We propose multiscale bagging as a modification of the bagging procedure. Ordinary bagging generates bootstrap samples by bootstrap resampling; we replace this step with the multiscale bootstrap algorithm, in which the size m of the bootstrap samples may differ from the size n of the learning dataset. To assess the output of a classifier, we compute the bootstrap probability of a class label: the frequency with which that label is output by classifiers learned from bootstrap samples. A scaling law for the bootstrap probability with respect to σ² = n/m has been developed in connection with the geometrical theory. We consider two ways of using multiscale bagging of classifiers. The first is to construct a confidence set of class labels, instead of a single label. The second is to find inputs close to the decision boundary, in the context of query by bagging for active learning. Interestingly, the appropriate choice of m turns out to be m = -n, i.e., σ² = -1, for the first usage, and m = ∞, i.e., σ² = 0, for the second.
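The core computation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name is hypothetical, a simple nearest-centroid classifier stands in for the unspecified base learner, and only finite positive m is handled (the paper's choices m = -n and m = ∞ are reached by extrapolating the scaling law in σ² = n/m, not by direct resampling).

```python
import numpy as np

def multiscale_bagging_probs(X, y, x_query, m, B=200, rng=None):
    """Bootstrap probability of each class label at x_query.

    Draws B bootstrap samples of size m (possibly != n) with
    replacement, fits a nearest-centroid classifier to each, and
    returns the frequency of each predicted label.
    """
    rng = np.random.default_rng(rng)
    n = len(y)
    labels = np.unique(y)
    counts = {c: 0 for c in labels}
    for _ in range(B):
        idx = rng.integers(0, n, size=m)      # bootstrap sample of size m
        Xb, yb = X[idx], y[idx]
        # classes present in this resample (a class may be missed when m is small)
        present = [c for c in labels if np.any(yb == c)]
        cents = np.array([Xb[yb == c].mean(axis=0) for c in present])
        pred = present[int(np.argmin(np.linalg.norm(cents - x_query, axis=1)))]
        counts[pred] += 1
    return {c: counts[c] / B for c in labels}
```

Varying m relative to n changes σ² = n/m, and the bootstrap probabilities computed at several scales can then be fitted to the scaling law.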