|Abstract:||Software is pervasive in our daily lives. Unfortunately, software bugs can severely affect the dependability and security of software systems. Among all types of software bugs, the concurrency bug is one of the most troublesome and important. Concurrency bugs widely exist in concurrent programs. They are difficult to detect and diagnose because of their unique non-determinism. In the real world, concurrency bugs have caused several disasters in the past and are generating increasingly severe problems in recent years with the prevalence of multi-core hardware and concurrent programs. Facing the challenge of concurrency bugs, this thesis proposes effective concurrency bug detection and concurrent program testing approaches based on a comprehensive characteristics study of real-world concurrency bugs.
This thesis makes three main contributions. The first contribution is a comprehensive characteristics study of real-world concurrency bugs. A good understanding of real-world bugs is always the foundation for addressing the software bug problem. This dissertation conducts the first comprehensive empirical study of concurrency bug patterns, manifestation conditions, and fix strategies based on a large number of concurrency bugs sampled from widely used open source C/C++ server/client applications. This characteristics study provides many motivation and guidelines for concurrency bug detection, testing and programming language design.
The second main contribution is the proposal of novel techniques to automatically infer programmers' synchronization intentions and detect important types of concurrency bugs. A fundamental problem in concurrency bug detection is determining what types of interleavings are intended and what are not. This thesis proposes novel techniques to automatically infer two types of important synchronization intentions: single-variable atomicity intentions and multi-variable correlations from source code and program execution. Based on these intention inference approaches, two bug detection tools, AVIO and MUVI, are designed to detect concurrency bugs that are common yet not well addressed: atomicity violation bugs and multi-variable concurrency bugs. Experiments have shown that AVIO and MUVI can accurately detect many bugs that existing techniques cannot detect. Previously unknown bugs in widely used open source concurrent programs are also detected.
The third main contribution is along the lines of exploring concurrent programs' interleaving space and exposing concurrency bugs. This thesis presents a hierarchy of interleaving coverage criteria. This hierarchy includes seven interleaving coverage criteria that are designed based on different concurrency bug models and provides guidance to interleaving space exploration. Guided by the coverage criteria research, a testing framework, CTrigger, is built to expose atomicity violation bugs. CTrigger's testing space, called unserializable interleaving space, is carefully designed to balance its complexity and bug-exposing capability. Within this testing space, CTrigger uses trace analysis to identify feasible and rare unserializable interleavings; it uses low-overhead execution perturbation to exercise these interleavings and effectively expose atomicity violation bugs. Experiments have shown that CTrigger can expose real-world atomicity violation bugs 100--1000 times faster than the common practice stress testing. In addition, CTrigger can reliably repeat the bugs that are exposed once 300 to more than 60,000 times faster than stress testing, which will greatly expedite the software failure diagnosis process.