Background: Automatic program repair aims to reduce costs associated with defect repair. The detection and characterization of common bug-fix patterns in software repositories play an important role in advancing this field.
Aim: In this paper, we characterize the occurrence of known bug-fix patterns in Java repositories at an unprecedentedly large scale. Furthermore, we propose a novel automatic technique for unveiling frequent and isolated repair actions corresponding to realistic bug fixes in Java.
Method: The study was conducted on Java GitHub projects organized in two distinct data sets. The first data set (Boa) contains more than 4 million bug-fix commits from 101,471 projects. The second data set (Defects4J) contains 369 real bug fixes from five open-source projects.
Results: We characterized the prevalence of the five most common bug-fix patterns (identified in the work of Pan et al.) in those bug fixes. The combined results showed direct evidence that developers often forget to add IF preconditions in the code.
Conclusion: We discovered a total of 155 repair actions from Defects4J patches and discuss 10 pervasive repair actions that occur across all analyzed Java projects. Moreover, the overall precision and recall values for the clustering approach were 0.62 and 0.64, respectively.
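
Illustration: The missing IF precondition noted in the Results can be pictured with a minimal Java sketch. The class and method names below are hypothetical and are not taken from Boa, Defects4J, or the work of Pan et al.; the sketch only assumes that a fix of this kind adds a guard before code that would otherwise fail on an unexpected input.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of a bug fix that adds an IF precondition.
// None of these names come from the studied projects.
final class IfPreconditionExample {

    // Before the fix: iterating over a null list throws NullPointerException.
    static int countNonEmptyBuggy(List<String> names) {
        int count = 0;
        for (String name : names) {
            if (!name.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    // After the fix: the added IF precondition returns early for a null list.
    static int countNonEmptyFixed(List<String> names) {
        if (names == null) {
            return 0;
        }
        int count = 0;
        for (String name : names) {
            if (!name.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The guarded version tolerates the null input.
        System.out.println(countNonEmptyFixed(null));
        System.out.println(countNonEmptyFixed(Arrays.asList("a", "", "b")));
    }
}
```

The only difference between the two methods is the added guard, which is the kind of small, isolated change that a repair action is meant to capture.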