July 18, 2017

Mining Historical Information to Study Bug Fixes

Software is present in almost all economic activity and contributes to economic growth in many ways. At the same time, like any other man-made artifact, software suffers from bugs that lead to incorrect results, deadlocks, or even crashes of the entire system.

Several approaches have been proposed to aid debugging. An interesting recent research direction is automatic program repair, which has shown promising results in reducing the cost of defect repair during software maintenance. Identifying common bug fix patterns is an important step toward generating program patches automatically.

In this paper, we conduct an empirical study of more than 4 million bug-fixing commits distributed among 101,471 Java projects hosted on GitHub. We use Boa, a domain-specific programming language, to analyze this ultra-large-scale data efficiently. With Boa’s support, we automatically measure the prevalence of the 5 most common bug fix patterns (identified in the work of Pan et al.) in those bug-fixing commits.
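
The kind of analysis described above can be illustrated with a minimal Python sketch. It is not the Boa program used in the study. The keyword heuristic for recognizing bug-fixing commits, the commit dictionary layout, and the single `changes_if_condition` detector are all assumptions made for illustration; the detector is a hypothetical stand-in and is not claimed to be one of the patterns from Pan et al.

```python
import re
from collections import Counter

# Assumption: a commit is treated as bug-fixing when its message
# contains a fix-related keyword. The study's actual criterion may differ.
FIX_KEYWORDS = re.compile(r"\b(fix(es|ed)?|bug|defect|error)\b", re.IGNORECASE)


def is_bug_fix(message):
    """Return True if the commit message looks like a bug fix."""
    return bool(FIX_KEYWORDS.search(message))


def changes_if_condition(removed, added):
    """Hypothetical detector: an `if` line was removed and a different one added."""
    old = {line.strip() for line in removed if line.strip().startswith("if (")}
    new = {line.strip() for line in added if line.strip().startswith("if (")}
    return bool(old) and bool(new) and old != new


def pattern_prevalence(commits, detectors):
    """Fraction of bug-fixing commits matched by each detector.

    `commits` is a list of dicts with keys "message", "removed", "added"
    (an assumed layout); `detectors` maps a pattern name to a function
    taking the removed and added lines.
    """
    counts = Counter()
    total = 0
    for commit in commits:
        if not is_bug_fix(commit["message"]):
            continue  # only bug-fixing commits are counted
        total += 1
        for name, detect in detectors.items():
            if detect(commit["removed"], commit["added"]):
                counts[name] += 1
    return {name: (counts[name] / total if total else 0.0) for name in detectors}
```

Calling `pattern_prevalence(commits, {"if_condition_change": changes_if_condition})` returns, for each named detector, the share of bug-fixing commits it matched. A real study at this scale would run the equivalent logic on Boa's infrastructure rather than in local memory.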