Statistical Debugging and Automated Program Failure Triage
- Statistical Debugging and Automated Program Failure Triage
- Liu, Chao
- Issue Date
- software engineering
- programming languages
- Recent years have seen great advances in software engineering and programming languages, and more and more time is devoted to extensive testing and exhaustive debugging. Unfortunately, software is still far from bug-free, even for those deployed. Static analysis is a quality approach to eliminating numerous bugs, but its conservative nature of analysis unavoidably constrains its capacity. Dynamic analysis, on the other hand, utilizes program runtime execution data, and automatically infers about likely problems with the programs, which complements static approaches in ensuring program quality. This thesis describes three dynamic techniques that leverage program runtime data to improve software quality. First, we present a statistical debugging algorithm, called Sober, which automatically localizes software faults without any prior knowledge of the program semantics. This statistical debugging alleviates the high cost associated with human efforts in manual debugging. Featuring a similar rationale to hypothesis testing, Sober quantifies the fault relevance of each predicate in a principled way. We systematically evaluate Sober under the same setting as previous studies, and compare Sober with other seven algorithms. The result clearly demonstrates the effectiveness: Sober could help developers locate 68 out of the 130 faults in the Siemens suite by examining no more than 10% of the code, whereas the Cause Transition approach proposed by Holger et al. and the statistical approach by Liblit et al. locate 34 and 52 faults, respectively. Moreover, the effectiveness of Sober is also evaluated in an .imperfect world., where the test suite is either inadequate or only partially labeled. Our experiments indicate that Sober could achieve competitive quality under these harsh circumstances. Finally, four case studies with flex-2.4.7, grep-2.2, gzip-1.2.3, and bc-1.06 are reported, which shed light on the applicability of Sober on reasonably large programs. Second, we discuss automated program failure triage, which is a closely related problem with automated debugging. Recent software systems usually feature an automated failure reporting component, with which a huge number of failures are collected from software end-users. In order to effectively leverage the valuable program failure data, the collected failures need to be first triaged, i.e., to locate the most severe failures, and to assign them to appropriate developers. Lying in the center of failure triage is failure indexing, which tries to group failures due to the fault together. Previous studies index program failures based on the trace similarity by hypothesizing that similar failing traces imply the same fault. But because a fault can be triggered in a number of different ways, failing traces due to the same fault can be quite different. In this thesis, we propose a statistical debugging-based approach to program failure triage, called R-Proximity, which better indexes failures and facilitates failure assignment. Two detailed case studies with grep-2.2 and gzip-1.2.3 are provided, which demonstrate the claimed advantages. Finally, we describe a program dynamic slicing-based approach to failure indexing, which complements R-Proximity. According to our evaluation, R-Proximity is a quality failure indexing tool, but its effectiveness relies on a sufficient number of correct executions, which may or may not be available in practice. The proposed approach based on dynamic slices does not require any correct executions, and hence perfectly complements R-Proximity. Three case studies with grep, gzip, and flex are performed, which validates the advantages of the proposed approach.
- Type of Resource
- Copyright and License Information
- You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format, BUT this permission is only for a period of 45 (forty-five) days from the most recent time that you verified that this technical report is still available from the University of Illinois at Urbana-Champaign Computer Science Department under terms that include this permission. All other rights are reserved by the author(s).
Edit Collection Membership