

Empirical Studies of a Kernel Density Estimation Based Naive Bayes Method for Software Defect Prediction
Haijin JI, Song HUANG, Xuewei LV, Yaning WU, Yuntian FENG
Publication
IEICE TRANSACTIONS on Information and Systems
Vol.E102-D
No.1
pp.75-84
Publication Date: 2019/01/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2018EDP7177
Type of Manuscript: PAPER
Category: Software Engineering
Keyword: software defect prediction, naive Bayes, kernel density estimation, software metrics
Summary:
Software defect prediction (SDP) plays a significant part in allocating testing resources reasonably, reducing testing costs, and ensuring software quality. Naive Bayes (NB) is one of the most widely used algorithms in SDP models because of its simplicity, effectiveness, and robustness. When a data set has continuous or numeric attributes, NB generally assumes that they follow normal distributions and uses the probability density function of the normal distribution to estimate their conditional probabilities. However, a Kolmogorov-Smirnov test shows that the 21 main software metrics do not follow a normal distribution at the 5% significance level. This paper therefore proposes an improved NB approach that estimates the conditional probabilities of NB by kernel density estimation on the training data sets, with the aim of improving the prediction accuracy of NB for SDP. To evaluate the proposed method, we carry out experiments on 34 software releases obtained from 10 open source projects provided by the PROMISE repository. Four well-known classification algorithms are included for comparison: Naive Bayes, Support Vector Machine, Logistic Regression, and Random Tree. The results show that the new method is more successful than the four well-known classification algorithms on most of the software releases.
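
The idea described in the summary, replacing the normal-distribution assumption in NB with a kernel density estimate per class and per attribute, can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian kernel, the normal-reference rule-of-thumb bandwidth, the names `rule_of_thumb_bandwidth`, `kde_density`, and `KDENaiveBayes`, and the toy metric values are all assumptions made for illustration, since the summary does not state these details.

```python
import math
from statistics import stdev


def rule_of_thumb_bandwidth(values):
    """Normal-reference bandwidth 1.06 * sigma * n^(-1/5).

    Assumption: the summary does not say how the bandwidth is chosen.
    A small floor avoids a zero bandwidth for constant attributes.
    """
    n = len(values)
    sigma = stdev(values) if n > 1 else 0.0
    return max(1.06 * sigma * n ** (-1 / 5), 1e-3)


def kde_density(x, samples, h):
    """Gaussian kernel density estimate of the samples, evaluated at x."""
    norm = len(samples) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples) / norm


class KDENaiveBayes:
    """Naive Bayes whose class-conditional densities come from a KDE."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.priors, self.columns, self.bandwidths = {}, {}, {}
        for c in self.classes:
            rows = [row for row, label in zip(X, y) if label == c]
            self.priors[c] = len(rows) / len(X)
            # One column of training values per attribute, for this class.
            self.columns[c] = list(zip(*rows))
            self.bandwidths[c] = [
                rule_of_thumb_bandwidth(col) for col in self.columns[c]
            ]
        return self

    def log_score(self, x, c):
        """Unnormalized log posterior: log prior + sum of log densities."""
        score = math.log(self.priors[c])
        for value, col, h in zip(x, self.columns[c], self.bandwidths[c]):
            # The tiny constant guards against log(0) far from all samples.
            score += math.log(kde_density(value, col, h) + 1e-300)
        return score

    def predict(self, X):
        return [
            max(self.classes, key=lambda c: self.log_score(x, c)) for x in X
        ]


# Toy data invented for illustration: two made-up metrics per module,
# label 0 = non-defective, label 1 = defective.
X_train = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5],
           [5.0, 30.0], [5.5, 28.0], [4.8, 31.0]]
y_train = [0, 0, 0, 1, 1, 1]

model = KDENaiveBayes().fit(X_train, y_train)
predictions = model.predict([[1.1, 10.2], [5.1, 29.0]])
print(predictions)
```

Compared with standard Gaussian NB, only the density estimate changes: each attribute's class-conditional density is built from the training values themselves rather than from a fitted mean and variance, so skewed or multimodal metrics are not forced into a bell shape.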

