TY - GEN
T1 - A Novel Approach for Software Defect Prediction Through Relational Association Rules Based on Cost-Sensitive Learning
AU - Tian, Meng
AU - Wang, Shihai
AU - Wu, Wentao
AU - Xie, Wandong
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Software defect prediction (SDP) can predict software modules with potential defect risks before software testing, thereby optimizing the allocation of testing resources. Relational association rules can characterize the relations between data attributes and reveal relevant patterns in complex data. We propose a software defect prediction model using relational association rules based on cost-sensitive learning (CSLRAR). To address the inherent class-imbalance problem of defect data, CSLRAR employs a one-class classification strategy to separately mine relational association rules for the defective class and the non-defective class using Apriori. Furthermore, we use all training data to construct a feature relational association rule selection mechanism, which serves as the basis for deciding whether each rule in the defective relational association rule set (RAR+) and the non-defective relational association rule set (RAR-) is retained. The feature relational association rule selection mechanism can improve the quality of the rule sets obtained during the rule generation stage. In addition, we conducted experimental evaluations on nine publicly available datasets from the PROMISE database. Comparison with five baseline models shows that CSLRAR performs significantly better than the baselines in terms of Balance, MCC, and Gmean.
AB - Software defect prediction (SDP) can predict software modules with potential defect risks before software testing, thereby optimizing the allocation of testing resources. Relational association rules can characterize the relations between data attributes and reveal relevant patterns in complex data. We propose a software defect prediction model using relational association rules based on cost-sensitive learning (CSLRAR). To address the inherent class-imbalance problem of defect data, CSLRAR employs a one-class classification strategy to separately mine relational association rules for the defective class and the non-defective class using Apriori. Furthermore, we use all training data to construct a feature relational association rule selection mechanism, which serves as the basis for deciding whether each rule in the defective relational association rule set (RAR+) and the non-defective relational association rule set (RAR-) is retained. The feature relational association rule selection mechanism can improve the quality of the rule sets obtained during the rule generation stage. In addition, we conducted experimental evaluations on nine publicly available datasets from the PROMISE database. Comparison with five baseline models shows that CSLRAR performs significantly better than the baselines in terms of Balance, MCC, and Gmean.
KW - Class imbalance
KW - Data mining
KW - Relational association rule
KW - Software defect prediction
UR - https://www.scopus.com/pages/publications/85209814423
U2 - 10.1109/QRS-C63300.2024.00117
DO - 10.1109/QRS-C63300.2024.00117
M3 - Conference contribution
AN - SCOPUS:85209814423
T3 - Proceedings - 2024 IEEE 24th International Conference on Software Quality, Reliability and Security Companion, QRS-C 2024
SP - 880
EP - 886
BT - Proceedings - 2024 IEEE 24th International Conference on Software Quality, Reliability and Security Companion, QRS-C 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 24th IEEE International Conference on Software Quality, Reliability and Security Companion, QRS-C 2024
Y2 - 1 July 2024 through 5 July 2024
ER -