Computationally Efficient Class-Prior Estimation under Class Balance Change Using Energy Distance

Hideko KAWAKUBO, Marthinus Christoffel DU PLESSIS, Masashi SUGIYAMA

IEICE TRANSACTIONS on Information and Systems   Vol.E99-D   No.1   pp.176-186
Publication Date: 2016/01/01
Publicized: 2015/10/06
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2015EDP7212
Type of Manuscript: PAPER
Category: Artificial Intelligence, Data Mining
Keywords: class balance change, class-prior estimation, energy distance

In many real-world classification problems, the class balance changes between the training and test datasets because of sample selection bias or non-stationarity of the environment. Naive classifier training under such class balance change systematically yields a biased solution. It is known that this systematic bias can be corrected by weighting the training samples according to the test class balance. However, the test class balance is often unknown in practice. In this paper, we consider a semi-supervised learning setup in which labeled training samples and unlabeled test samples are available, and we propose a class balance estimator based on the energy distance. Through experiments, we demonstrate that the proposed method is substantially more computationally efficient than existing approaches while achieving comparable accuracy.
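
The abstract does not state the estimator's formula, so the following is only a minimal sketch of how an energy-distance-based class balance estimate could be computed in the binary case. It is not necessarily the authors' exact estimator. It assumes the test distribution is a mixture theta * p(x | positive) + (1 - theta) * p(x | negative), estimates every expectation by a plain average over all sample pairs, and minimizes the resulting energy distance, which is quadratic in theta and therefore has a closed-form minimizer. The function names (`mean_dist`, `estimate_prior`, `class_weights`) and the synthetic Gaussian data are hypothetical and chosen for illustration.

```python
import math
import random
from itertools import product


def mean_dist(xs, ys):
    """Average Euclidean distance over all pairs (x, y) with x in xs, y in ys."""
    total = sum(math.dist(x, y) for x, y in product(xs, ys))
    return total / (len(xs) * len(ys))


def estimate_prior(train_pos, train_neg, test):
    """Estimate the positive-class proportion of `test` (binary case).

    Assumption: test ~ theta * p_pos + (1 - theta) * p_neg. The energy
    distance between this mixture and the test sample is quadratic in
    theta, so its minimizer is available in closed form.
    """
    a_pos = mean_dist(train_pos, test)       # E||X_pos - X_test||
    a_neg = mean_dist(train_neg, test)       # E||X_neg - X_test||
    b_pp = mean_dist(train_pos, train_pos)   # E||X_pos - X_pos'||
    b_nn = mean_dist(train_neg, train_neg)   # E||X_neg - X_neg'||
    b_pn = mean_dist(train_pos, train_neg)   # E||X_pos - X_neg||
    # Energy distance between the two class-conditional samples.
    denom = 2.0 * b_pn - b_pp - b_nn
    if denom <= 0.0:
        raise ValueError("class-conditional samples are indistinguishable")
    theta = (a_neg - a_pos + b_pn - b_nn) / denom
    return min(1.0, max(0.0, theta))         # clip to a valid proportion


def class_weights(train_prior, test_prior):
    """Per-class weights (positive, negative) for weighted training."""
    return test_prior / train_prior, (1.0 - test_prior) / (1.0 - train_prior)


# Illustrative synthetic data: two 2-D Gaussian classes (hypothetical setup).
rng = random.Random(0)


def draw(mean, n):
    return [(rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)) for _ in range(n)]


train_pos, train_neg = draw(0.0, 100), draw(3.0, 100)  # balanced training set
test = draw(0.0, 30) + draw(3.0, 70)                   # 30% positive at test time

theta_hat = estimate_prior(train_pos, train_neg, test)
print(f"estimated positive-class prior: {theta_hat:.3f}")
print("training weights (pos, neg):", class_weights(0.5, theta_hat))
```

In this sketch the estimate requires only averages of pairwise distances and no iterative optimization or hyperparameter tuning. The last helper shows how an estimated test class balance could feed into the weighted training mentioned in the abstract, by weighting each class with the ratio of its test proportion to its training proportion.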