Selective Pseudo-Labeling Based Subspace Learning for Cross-Project Defect Prediction

Ying SUN  Xiao-Yuan JING  Fei WU  Yanfei SUN  

IEICE TRANSACTIONS on Information and Systems   Vol.E103-D   No.9   pp.2003-2006
Publication Date: 2020/09/01
Publicized: 2020/06/10
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2020EDL8034
Type of Manuscript: LETTER
Category: Software Engineering
Keywords: cross-project defect prediction, pseudo-labeling, subspace learning

Cross-project defect prediction (CPDP) has recently become a hot research topic. It uses data from an existing source project to construct a prediction model and predicts the defect-proneness of software instances from a target project. However, bridging the distribution difference between projects is challenging. To minimize the data distribution difference between projects and predict unlabeled target instances, we present a novel approach called selective pseudo-labeling based subspace learning (SPSL). SPSL learns a common subspace by using both labeled source instances and pseudo-labeled target instances. The accuracy of pseudo-labeling is improved by an iterative selective pseudo-labeling strategy: the pseudo-labeled instances from the target project are iteratively updated by selecting the instances with high confidence from two pseudo-labeling techniques. Experiments conducted on the AEEEM dataset show that SPSL is effective for CPDP.
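
The iterative selection strategy described above can be illustrated with a minimal sketch. The abstract does not name the two pseudo-labeling techniques or the subspace learner, so everything concrete here is an assumption made for illustration: a nearest-class-prototype labeler and a k-nearest-neighbor labeler stand in for the two techniques, the subspace re-learning step is left as a placeholder comment, and all function names (`prototype_probs`, `knn_probs`, `select_pseudo_labels`, `iterative_selective_pseudo_labeling`) are hypothetical. Labels are assumed to be 0 (clean) and 1 (defective).

```python
import math

def prototype_probs(X_src, y_src, X_tgt):
    """Stand-in labeler A (assumption): soft labels from distance to class means."""
    classes = sorted(set(y_src))
    protos = []
    for c in classes:
        rows = [x for x, y in zip(X_src, y_src) if y == c]
        protos.append([sum(col) / len(rows) for col in zip(*rows)])
    probs = []
    for x in X_tgt:
        dists = [math.dist(x, p) for p in protos]
        d_min = min(dists)  # shift distances so exp() cannot underflow to all zeros
        w = [math.exp(-(d - d_min)) for d in dists]
        probs.append([v / sum(w) for v in w])
    return probs

def knn_probs(X_src, y_src, X_tgt, k=3):
    """Stand-in labeler B (assumption): class frequencies among the k nearest neighbors."""
    classes = sorted(set(y_src))
    probs = []
    for x in X_tgt:
        nearest = sorted(range(len(X_src)), key=lambda i: math.dist(x, X_src[i]))[:k]
        votes = [y_src[i] for i in nearest]
        probs.append([votes.count(c) / len(votes) for c in classes])
    return probs

def select_pseudo_labels(p_a, p_b, fraction):
    """Keep at most ceil(fraction * n) target instances, ranked by mean confidence,
    considering only instances on which both labelers agree."""
    candidates = []
    for i, (pa, pb) in enumerate(zip(p_a, p_b)):
        label_a = max(range(len(pa)), key=pa.__getitem__)
        label_b = max(range(len(pb)), key=pb.__getitem__)
        if label_a == label_b:
            candidates.append(((pa[label_a] + pb[label_b]) / 2, i, label_a))
    candidates.sort(key=lambda c: -c[0])
    n_keep = math.ceil(fraction * len(p_a))
    return {i: label for _, i, label in candidates[:n_keep]}

def iterative_selective_pseudo_labeling(X_src, y_src, X_tgt, n_iter=5):
    """Grow the pseudo-labeled target set over n_iter rounds."""
    selected = {}  # target index -> pseudo-label
    for t in range(1, n_iter + 1):
        # Placeholder: a full method would re-learn the common subspace here from
        # the source instances plus the currently selected target instances and
        # project both sets into it. This sketch works in the raw feature space.
        X_train = list(X_src) + [X_tgt[i] for i in selected]
        y_train = list(y_src) + list(selected.values())
        p_a = prototype_probs(X_train, y_train, X_tgt)
        p_b = knn_probs(X_train, y_train, X_tgt)
        selected = select_pseudo_labels(p_a, p_b, t / n_iter)
    return selected
```

In this sketch the selected fraction grows linearly with the iteration count, so early rounds train only on the most confident target instances. That schedule is an illustrative choice, not a detail taken from the abstract.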