An Empirical Study of Regression Bug Chains in Linux

Research output: Contribution to journal › Article › peer-review

Abstract

Regression bugs are bugs that cause a software feature that previously worked correctly to stop working after a certain commit. This paper presents a systematic study of regression bug chains, an important but unexplored phenomenon of regression bugs. Our study is based on the observation that a commit c1, which fixes a regression bug b1, may accidentally introduce another regression bug b2. Likewise, a commit c2 repairing b2 may cause another regression bug b3, resulting in a bug chain, i.e., b1 → c1 → b2 → c2 → b3. We conducted a large-scale study by collecting 1579 regression bugs and 2630 commits from 57 Linux versions (from 2.6.12 to 4.9). The relationships between regression bugs and commits are modeled as a directed bipartite network. Our major contributions and findings are fourfold: 1) a novel concept of regression bug chains and their formulation; 2) compared to an isolated regression bug, a bug on a regression bug chain is much more difficult to repair, costing 2.4× more fixing time and involving 1.3× more developers and 2.8× more comments; 3) 85.8% of bugs on the chains in Linux reside in Drivers, ACPI, Platform Specific/Hardware, and Power Management; and 4) 83% of the chains affect only a single Linux subsystem, while 68% of the chains propagate across Linux versions.
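
The chain b1 → c1 → b2 → c2 → b3 and the directed bipartite network mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the input format (pairs of identifiers), and the rule that a chain must contain at least two bugs are assumptions made for illustration only.

```python
from collections import defaultdict


def build_network(fixes, introduces):
    """Build a directed bipartite graph over bugs and commits.

    fixes:      (bug, commit) pairs, meaning the commit fixes the bug.
    introduces: (commit, bug) pairs, meaning the commit introduces the bug.
    Both input formats are hypothetical, chosen for this sketch.
    """
    succ = defaultdict(list)
    for bug, commit in fixes:
        succ[("bug", bug)].append(("commit", commit))
    for commit, bug in introduces:
        succ[("commit", commit)].append(("bug", bug))
    return succ


def regression_bug_chains(fixes, introduces):
    """Return maximal bug/commit paths that contain at least two bugs."""
    succ = build_network(fixes, introduces)
    targets = {node for outs in succ.values() for node in outs}
    # A root is a bug that no commit in the data set introduced.
    roots = sorted(n for n in succ if n[0] == "bug" and n not in targets)

    paths = []

    def walk(node, path):
        # Skip nodes already on the path so a cycle cannot recurse forever.
        nxt = [n for n in succ.get(node, []) if n not in path]
        if not nxt:
            paths.append(path)
            return
        for n in nxt:
            walk(n, path + [n])

    for root in roots:
        walk(root, [root])

    # Assumption: an isolated bug (one bug and its fix) is not a chain.
    return [
        [name for _, name in path]
        for path in paths
        if sum(1 for kind, _ in path if kind == "bug") >= 2
    ]


if __name__ == "__main__":
    fixes = [("b1", "c1"), ("b2", "c2"), ("b4", "c4")]
    introduces = [("c1", "b2"), ("c2", "b3")]
    for chain in regression_bug_chains(fixes, introduces):
        print(" -> ".join(chain))
```

In this toy input, b4 is fixed by c4 without any follow-on regression, so it counts as an isolated bug and is filtered out, while b1 starts a chain that runs through two fixing commits.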

Original language: English
Article number: 8673578
Pages (from-to): 558-570
Number of pages: 13
Journal: IEEE Transactions on Reliability
Volume: 69
Issue number: 2
State: Published - Jun 2020

Keywords

  • Bipartite network
  • Bug-fixing commit (BFC)
  • Bug-introducing commit (BIC)
  • Linux
  • Regression bug
  • Regression bug chain (RBC)
