Cost-Sensitive and Sparse Ladder Network for Software Defect Prediction

Jing SUN  Yi-mu JI  Shangdong LIU  Fei WU  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E103-D   No.5   pp.1177-1180
Publication Date: 2020/05/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2019EDL8198
Type of Manuscript: LETTER
Category: Software Engineering
Keyword: 
semi-supervised learning, software defect prediction, ladder network, cost-sensitive learning, sparse auto-encoder

Summary: 
Software defect prediction (SDP) plays a vital role in allocating testing resources reasonably and ensuring software quality. For cases where labeled historical modules are scarce, a considerable number of semi-supervised SDP methods have been proposed; these methods use the limited labeled modules and the abundant unlabeled modules simultaneously. Nevertheless, most of them rely on traditional features rather than more powerful deep feature representations. In addition, misclassifying a defective module costs more than misclassifying a defect-free one, and the number of defective modules available for training is small. Taking these issues into account, we propose a cost-sensitive and sparse ladder network (CSLN) for SDP. We first introduce the semi-supervised ladder network to extract deep feature representations. We then introduce cost-sensitive learning, which assigns different misclassification costs to defect-prone and defect-free instances, to alleviate the class imbalance problem. Finally, a sparsity constraint is added to the hidden nodes of the ladder network when the number of hidden nodes is large, which enables the model to find robust structures in the data. Extensive experiments on the AEEEM dataset show that CSLN outperforms several state-of-the-art semi-supervised SDP methods.
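
The summary does not give the loss formulas, so the following is only a minimal sketch of the two ideas it names, not the authors' formulation. It assumes a weighted binary cross-entropy for the cost-sensitive term and a KL-divergence penalty on mean hidden activations for the sparse constraint (the form commonly used in sparse auto-encoders). The function names, the cost values `cost_fn` and `cost_fp`, and the target sparsity `rho` are all hypothetical choices made for illustration.

```python
import numpy as np


def cost_sensitive_cross_entropy(p_defective, y, cost_fn=5.0, cost_fp=1.0):
    """Weighted binary cross-entropy (illustrative, not the paper's loss).

    p_defective: predicted probability of the defective class, shape (n,)
    y:           labels, 1 = defective, 0 = defect-free
    cost_fn:     hypothetical cost of missing a defective module
    cost_fp:     hypothetical cost of flagging a defect-free module
    """
    p = np.clip(np.asarray(p_defective, dtype=float), 1e-12, 1 - 1e-12)
    y = np.asarray(y, dtype=float)
    per_sample = -(cost_fn * y * np.log(p) + cost_fp * (1 - y) * np.log(1 - p))
    return per_sample.mean()


def kl_sparsity_penalty(hidden, rho=0.05):
    """KL-divergence sparsity penalty (assumed form).

    hidden: activations in (0, 1), shape (n_samples, n_hidden)
    rho:    hypothetical target mean activation per hidden node
    """
    # Mean activation of each hidden node over the batch.
    rho_hat = np.clip(np.asarray(hidden, dtype=float).mean(axis=0), 1e-12, 1 - 1e-12)
    kl = rho * np.log(rho / rho_hat) + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))
    return kl.sum()
```

With equal costs the first function reduces to ordinary binary cross-entropy; raising `cost_fn` makes errors on defective modules weigh more, which is one common way to counter class imbalance. The penalty is zero when every hidden node's mean activation equals `rho` and grows as activations move away from it. In a full model these terms would be added to the ladder network's supervised and reconstruction costs with weighting factors that the summary does not specify.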