Recommending Good First Issues in GitHub OSS Projects

  • Wenxin Xiao
  • Hao He
  • Weiwei Xu
  • Xin Tan
  • Jinhao Dong
  • Minghui Zhou*
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Attracting and retaining newcomers is vital for the sustainability of an open-source software project. However, it is difficult for newcomers to locate suitable development tasks, while existing 'Good First Issues' (GFI) in GitHub are often insufficient and inappropriate. In this paper, we propose RECGFI, an effective practical approach for the recommendation of good first issues to newcomers, which can be used to relieve maintainers' burden and help newcomers onboard. RECGFI models an issue with features from multiple dimensions (content, background, and dynamics) and uses an XGBoost classifier to generate its probability of being a GFI. To evaluate RECGFI, we collect 53,510 resolved issues among 100 GitHub projects and carefully restore their historical states to build ground truth datasets. Our evaluation shows that RECGFI can achieve up to 0.853 AUC in the ground truth dataset and outperforms alternative models. Our interpretable analysis of the trained model further reveals interesting observations about GFI characteristics. Finally, we report latest issues (without GFI-signaling labels but recommended as GFI by our approach) to project maintainers among which 16 are confirmed as real GFIs and five have been resolved by a newcomer.
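
The AUC figure quoted in the abstract can be read as the probability that a randomly chosen GFI receives a higher predicted probability than a randomly chosen non-GFI. The following is a minimal sketch of that computation, assuming binary ground-truth labels and one predicted probability per issue. The function name `roc_auc` and the sample labels and scores are illustrative assumptions; they are not taken from the paper and this is not RECGFI's evaluation code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the fraction of (positive, negative) pairs ranked correctly.

    A pair counts as one win when the positive example has the higher score
    and as half a win when the two scores are tied.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical ground truth: 1 = issue is a GFI, 0 = it is not.
example_labels = [1, 0, 1, 0, 0, 1]
# Hypothetical classifier outputs: predicted probability of being a GFI.
example_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]

example_auc = roc_auc(example_labels, example_scores)
print(f"AUC = {example_auc:.3f}")  # 8 of 9 pairs are ordered correctly
```

The pairwise loop is quadratic in the number of issues, which is acceptable for a small illustration; on a dataset of tens of thousands of issues a rank-based implementation from an established library would be the more practical choice.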

Original language: English
Title of host publication: Proceedings - 2022 ACM/IEEE 44th International Conference on Software Engineering, ICSE 2022
Publisher: IEEE Computer Society
Pages: 1830-1842
Number of pages: 13
ISBN (Electronic): 9781450392211
State: Published - 5 Jul 2022
Externally published: Yes
Event: 44th ACM/IEEE International Conference on Software Engineering, ICSE 2022 - Hybrid, Pittsburgh, United States
Duration: 22 May 2022 to 27 May 2022

Publication series

Name: Proceedings - International Conference on Software Engineering
Volume: 2022-May
ISSN (Print): 0270-5257

Conference

Conference: 44th ACM/IEEE International Conference on Software Engineering, ICSE 2022
Country/Territory: United States
City: Hybrid, Pittsburgh
Period: 22/05/22 to 27/05/22

Keywords

  • good first issues
  • onboarding
  • open-source software
