SCPatcher: Mining Crowd Security Discussions to Enrich Secure Coding Practices

  • Ziyou Jiang
  • Lin Shi
  • Guowei Yang
  • Qing Wang*
  • *Corresponding author for this work
  • State Key Laboratory of Intelligent Game
  • CAS - Institute of Software
  • University of Chinese Academy of Sciences
  • University of Queensland

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Secure coding practices (SCPs) have been proposed to guide software developers to write code securely to prevent potential security vulnerabilities. Yet, they are typically one-sentence principles without detailed specifications, e.g., 'Properly free allocated memory upon the completion of functions and at all exit points.', which makes them difficult to follow in practice, especially for software developers who are not yet experienced in secure programming. To address this problem, this paper proposes SCPatcher, an automated approach to enrich secure coding practices by mining crowd security discussions on online knowledge-sharing platforms, such as Stack Overflow. In particular, for each security post, SCPatcher first extracts the area of coding examples and coding explanations with a fix-prompt tuned Large Language Model (LLM) via Prompt Learning. Then, it hierarchically slices the lengthy code into coding examples and summarizes the coding explanations with the areas. Finally, SCPatcher matches the CWE and Public SCP, integrating them with extracted coding examples and explanations to form the SCP specifications, which are the wild SCPs with details, proposed by the developers. To evaluate the performance of SCPatcher, we conduct experiments on 3,907 security posts from Stack Overflow. The experimental results show that SCPatcher outperforms all baselines in extracting the coding examples with 2.73 % MLine on average, as well as coding explanations with 3.97 % F1 on average. Moreover, we apply SCPatcher on 447 new security posts to further evaluate its practicality, and the extracted SCP specifications enrich the public SCPs with 3,074 lines of code and 1,967 sentences.

Original language: English
Title of host publication: Proceedings - 2023 38th IEEE/ACM International Conference on Automated Software Engineering, ASE 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 358-370
Number of pages: 13
ISBN (Electronic): 9798350329964
State: Published - 2023
Event: 38th IEEE/ACM International Conference on Automated Software Engineering, ASE 2023 - Echternach, Luxembourg
Duration: 11 Sep 2023 - 15 Sep 2023

Publication series

Name: Proceedings - 2023 38th IEEE/ACM International Conference on Automated Software Engineering, ASE 2023

Conference

Conference: 38th IEEE/ACM International Conference on Automated Software Engineering, ASE 2023
Country/Territory: Luxembourg
City: Echternach
Period: 11/09/23 - 15/09/23

Keywords

  • Artificial Intelligence
  • Large Language Model
  • Secure Coding Practice
  • Software Security
