TY - GEN
T1 - Predicting Crash Fault Residence via Simplified Deep Forest Based on A Reduced Feature Set
AU - Zhao, Kunsong
AU - Liu, Jin
AU - Xu, Zhou
AU - Li, Li
AU - Yan, Meng
AU - Yu, Jiaojiao
AU - Zhou, Yuxuan
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/5
Y1 - 2021/5
N2 - Software inevitably crashes, and developers spend considerable effort finding the fault that causes a crash (the crashing fault). Developing automatic methods to identify where the crashing fault resides is a crucial activity for software quality assurance. Researchers have proposed methods that predict whether the crashing fault resides in the stack trace, based on features collected from the stack trace and the faulty code, with the aim of saving developers debugging effort. However, previous work usually neglected feature preprocessing of the crash data and used only traditional classification models. In this paper, we propose a novel crashing fault residence prediction framework, called ConDF, which consists of a consistency-based feature subset selection method and a state-of-the-art deep forest model. More specifically, the feature selection method is first used to obtain an optimal feature subset and reduce the feature dimension by retaining the representative features. Then, a simplified deep forest model is employed to build the classification model on the reduced feature set. Experiments on seven open-source software projects show that ConDF performs significantly better than 17 baseline methods on three performance indicators.
AB - Software inevitably crashes, and developers spend considerable effort finding the fault that causes a crash (the crashing fault). Developing automatic methods to identify where the crashing fault resides is a crucial activity for software quality assurance. Researchers have proposed methods that predict whether the crashing fault resides in the stack trace, based on features collected from the stack trace and the faulty code, with the aim of saving developers debugging effort. However, previous work usually neglected feature preprocessing of the crash data and used only traditional classification models. In this paper, we propose a novel crashing fault residence prediction framework, called ConDF, which consists of a consistency-based feature subset selection method and a state-of-the-art deep forest model. More specifically, the feature selection method is first used to obtain an optimal feature subset and reduce the feature dimension by retaining the representative features. Then, a simplified deep forest model is employed to build the classification model on the reduced feature set. Experiments on seven open-source software projects show that ConDF performs significantly better than 17 baseline methods on three performance indicators.
KW - Crash localization
KW - deep forest
KW - feature subset selection
KW - stack trace
UR - https://www.scopus.com/pages/publications/85113229578
U2 - 10.1109/ICPC52881.2021.00031
DO - 10.1109/ICPC52881.2021.00031
M3 - Conference contribution
AN - SCOPUS:85113229578
T3 - IEEE International Conference on Program Comprehension
SP - 242
EP - 252
BT - Proceedings - 2021 IEEE/ACM 29th International Conference on Program Comprehension, ICPC 2021
PB - IEEE Computer Society
T2 - 29th IEEE/ACM International Conference on Program Comprehension, ICPC 2021
Y2 - 20 May 2021 through 21 May 2021
ER -