TY - GEN
T1 - Improving Fault Localization and Program Repair with Deep Semantic Features and Transferred Knowledge
AU - Meng, Xiangxin
AU - Wang, Xu
AU - Zhang, Hongyu
AU - Sun, Hailong
AU - Liu, Xudong
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/7/5
Y1 - 2022/7/5
N2 - Automatic software debugging mainly includes two tasks of fault lo-calization and automated program repair. Compared with the traditional spectrum-based and mutation-based methods, deep learning-based methods are proposed to achieve better performance for fault localization. However, the existing methods ignore the deep seman-tic features or only consider simple code representations. They do not leverage the existing bug-related knowledge from large-scale open-source projects either. In addition, existing template-based program repair techniques can incorporate project specific information better than deep-learning approaches. However, they are weak in selecting the fix templates for efficient program repair. In this work, we propose a novel approach called TRANSFER, which lever-ages the deep semantic features and transferred knowledge from open-source data to improve fault localization and program repair. First, we build two large-scale open-source bug datasets and design 11 BiLSTM-based binary classifiers and a BiLSTM-based multi-classifier to learn deep semantic features of statements for fault localization and program repair, respectively. Second, we combine semantic-based, spectrum-based and mutation-based features and use an MLP-based model for fault localization. Third, the semantic-based features are leveraged to rank the fix templates for program repair. Our extensive experiments on widely-used benchmark De-fects4J show that TRANSFER outperforms all baselines in fault localization, and is better than existing deep-learning methods in automated program repair. Compared with the typical template-based work TBar, TRANSFER can correctly repair 6 more bugs (47 in total) on Defects4J.
AB - Automatic software debugging mainly includes two tasks of fault lo-calization and automated program repair. Compared with the traditional spectrum-based and mutation-based methods, deep learning-based methods are proposed to achieve better performance for fault localization. However, the existing methods ignore the deep seman-tic features or only consider simple code representations. They do not leverage the existing bug-related knowledge from large-scale open-source projects either. In addition, existing template-based program repair techniques can incorporate project specific information better than deep-learning approaches. However, they are weak in selecting the fix templates for efficient program repair. In this work, we propose a novel approach called TRANSFER, which lever-ages the deep semantic features and transferred knowledge from open-source data to improve fault localization and program repair. First, we build two large-scale open-source bug datasets and design 11 BiLSTM-based binary classifiers and a BiLSTM-based multi-classifier to learn deep semantic features of statements for fault localization and program repair, respectively. Second, we combine semantic-based, spectrum-based and mutation-based features and use an MLP-based model for fault localization. Third, the semantic-based features are leveraged to rank the fix templates for program repair. Our extensive experiments on widely-used benchmark De-fects4J show that TRANSFER outperforms all baselines in fault localization, and is better than existing deep-learning methods in automated program repair. Compared with the typical template-based work TBar, TRANSFER can correctly repair 6 more bugs (47 in total) on Defects4J.
KW - Fault localization
KW - neural networks
KW - program repair
KW - software debugging
KW - transfer learning
UR - https://www.scopus.com/pages/publications/85133551961
U2 - 10.1145/3510003.3510147
DO - 10.1145/3510003.3510147
M3 - 会议稿件
AN - SCOPUS:85133551961
T3 - Proceedings - International Conference on Software Engineering
SP - 1169
EP - 1180
BT - Proceedings - 2022 ACM/IEEE 44th International Conference on Software Engineering, ICSE 2022
PB - IEEE Computer Society
T2 - 44th ACM/IEEE International Conference on Software Engineering, ICSE 2022
Y2 - 22 May 2022 through 27 May 2022
ER -