An Efficient Concept Drift Detection Method for Streaming Data under Limited Labeling

Youngin KIM  Cheong Hee PARK  

IEICE TRANSACTIONS on Information and Systems   Vol.E100-D   No.10   pp.2537-2546
Publication Date: 2017/10/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2017EDP7091
Type of Manuscript: PAPER
Category: Artificial Intelligence, Data Mining
Keyword: concept drift detection, limited labeling, probability estimates, streaming data

Errata: uploaded on November 1, 2017

In data stream analysis, accurate detection of concept drift is essential for maintaining classification performance. Most drift detection methods assume that the class label becomes available immediately after each data sample arrives. However, acquiring labels for all samples in a data stream is unrealistic, because labeling is costly and time-consuming. In this paper, we propose a concept drift detection method under the assumption that access to class labels is limited or unavailable. The proposed method detects concept drift in unlabeled data streams based on class label information predicted by a classifier or a virtual classifier. Experimental results on synthetic and real streaming data show that the proposed method can detect concept drift in unlabeled data streams.
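
The general idea of detecting drift without labels from a classifier's predicted probabilities can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper, whose details are not given in the abstract. It assumes a fixed one-dimensional logistic classifier, takes the probability of the predicted class as a confidence score, and compares the confidence distribution of a reference window with that of a recent window using a two-sample Kolmogorov-Smirnov statistic. The function names, the window size, and the threshold of 0.3 are all hypothetical choices made for illustration.

```python
import math
import random
from bisect import bisect_right


def confidences(xs):
    """Probability of the predicted class under an assumed fixed classifier.

    The classifier is hypothetical: P(class 1 | x) = sigmoid(x).
    """
    probs = [1.0 / (1.0 + math.exp(-x)) for x in xs]
    return [max(p, 1.0 - p) for p in probs]


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (largest gap between empirical CDFs)."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def detect_drift(reference_conf, current_conf, threshold=0.3):
    """Flag drift when the confidence distributions differ by more than threshold.

    The threshold is an assumed value, not one taken from the paper.
    """
    return ks_statistic(reference_conf, current_conf) > threshold


def sample_window(center, n, rng):
    """Unlabeled window: two Gaussian classes centered at -center and +center."""
    return [rng.gauss(rng.choice((-center, center)), 1.0) for _ in range(n)]


rng = random.Random(0)
reference = confidences(sample_window(2.0, 500, rng))  # well-separated classes
stable = confidences(sample_window(2.0, 500, rng))     # same concept
drifted = confidences(sample_window(0.5, 500, rng))    # classes moved together

print("stable window flagged:", detect_drift(reference, stable))
print("drifted window flagged:", detect_drift(reference, drifted))
```

No true labels are used at any point: only the classifier's own probability estimates enter the test. The sketch can therefore only notice drift that changes how confident the classifier is, which is a limitation of this simplified illustration.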