TY - JOUR
T1 - A novel software defect prediction based on atomic class-association rule mining
AU - Shao, Yuanxun
AU - Liu, Bin
AU - Wang, Shihai
AU - Li, Guoqi
N1 - Publisher Copyright:
© 2018 Elsevier Ltd
PY - 2018/12/30
Y1 - 2018/12/30
N2 - To ensure the rational allocation of software testing resources and reduce costs, software defect prediction has drawn notable attention to many “white-box” and “black-box” classification algorithms. Although there have been lots of studies on using software product metrics to identify defect-prone modules, defect prediction algorithms are still worth exploring. For instance, it is not easy to directly implement the Apriori algorithm to classify defect-prone modules across a skewed dataset. Therefore, we propose a novel supervised approach for software defect prediction based on atomic class-association rule mining (ACAR). It holds the characteristics of only one feature of the antecedent and a unique class label of the consequent, which is a specific kind of association rules that explores the relationship between attributes and categories. It holds the characteristics of only one feature of the antecedent and a unique class label of the consequent, which is a specific kind of association rules that explores the relationship between attributes and categories. Such association patterns can provide meaningful knowledge that can be easily understood by software engineers. A new software defect prediction model infrastructure based on association rules is employed to improve the prediction of defect-prone modules, which is divided into data preprocessing, rule model building and performance evaluation. Moreover, ACAR can achieve a satisfactory classification performance compared with other seven benchmark learners (the extension of classification based on associations (CBA2), Support Vector Machine, Naive Bayesian, Decision Tree, OneR, K-nearest Neighbors and RIPPER) on NASA MDP and PROMISE datasets. In light of software defect associative prediction, a comparative experiment between ACAR and CBA2 is discussed in details. It is demonstrated that ACAR is better than CBA2 in terms of AUC, G-mean, Balance, and understandability. In addition, the average AUC of ACAR is increased by 2.9% compared with CBA2, which can reach 81.1%.
AB - To ensure the rational allocation of software testing resources and reduce costs, software defect prediction has drawn notable attention to many “white-box” and “black-box” classification algorithms. Although there have been lots of studies on using software product metrics to identify defect-prone modules, defect prediction algorithms are still worth exploring. For instance, it is not easy to directly implement the Apriori algorithm to classify defect-prone modules across a skewed dataset. Therefore, we propose a novel supervised approach for software defect prediction based on atomic class-association rule mining (ACAR). It holds the characteristics of only one feature of the antecedent and a unique class label of the consequent, which is a specific kind of association rules that explores the relationship between attributes and categories. It holds the characteristics of only one feature of the antecedent and a unique class label of the consequent, which is a specific kind of association rules that explores the relationship between attributes and categories. Such association patterns can provide meaningful knowledge that can be easily understood by software engineers. A new software defect prediction model infrastructure based on association rules is employed to improve the prediction of defect-prone modules, which is divided into data preprocessing, rule model building and performance evaluation. Moreover, ACAR can achieve a satisfactory classification performance compared with other seven benchmark learners (the extension of classification based on associations (CBA2), Support Vector Machine, Naive Bayesian, Decision Tree, OneR, K-nearest Neighbors and RIPPER) on NASA MDP and PROMISE datasets. In light of software defect associative prediction, a comparative experiment between ACAR and CBA2 is discussed in details. It is demonstrated that ACAR is better than CBA2 in terms of AUC, G-mean, Balance, and understandability. In addition, the average AUC of ACAR is increased by 2.9% compared with CBA2, which can reach 81.1%.
KW - Apriori
KW - Association rules
KW - Data mining
KW - Machine learning
KW - Software defect prediction
UR - https://www.scopus.com/pages/publications/85050891830
U2 - 10.1016/j.eswa.2018.07.042
DO - 10.1016/j.eswa.2018.07.042
M3 - 文章
AN - SCOPUS:85050891830
SN - 0957-4174
VL - 114
SP - 237
EP - 254
JO - Expert Systems with Applications
JF - Expert Systems with Applications
ER -