Analyzing Threads for Shared Memory Consistency
Sura, Zehra Noman
- Issue Date
- parallel systems
- Languages that allow explicitly parallel, multithreaded programming (e.g., Java and C#) need to specify a memory consistency model to define program behavior. The memory consistency model defines constraints on the order of memory accesses in systems with shared memory. The design of a memory consistency model affects both the ease of parallel programming and system performance. Compiler analysis can be used to mitigate the performance impact of a memory consistency model that imposes strong constraints on shared memory access orders. In this work, we explore the capability of a compiler to analyze what restrictions a memory consistency model imposes on the program being compiled. Our compiler analysis targets Java bytecodes. It focuses on two components: delay set analysis and synchronization analysis. Delay set analysis determines the order of shared memory accesses that must be respected within each individual thread of execution in the source program. We describe a simplified analysis algorithm that is applicable to programs with general thread structure (MIMD programs) and has polynomial worst-case time complexity. This algorithm uses synchronization analysis to improve the accuracy of its results. Synchronization analysis determines the order of shared memory accesses already enforced by synchronization in the source program. We describe a dataflow analysis algorithm for synchronization analysis that is efficient to compute and that improves precision over previously known methods. The analysis techniques described are used in the implementation of a virtual machine that guarantees sequentially consistent execution of Java bytecodes. This implementation is used to show the effectiveness of our analysis algorithms. On many benchmark programs, the performance of programs on our system is close to 100% of the performance of the same programs executing under a relaxed memory model. Specifically, we observe an average slowdown of 10% on an Intel Xeon platform, with slowdowns of 7% or less for 7 out of 10 benchmarks. On an IBM Power3 platform, we observe an average slowdown of 26%, with slowdowns of 7% or less for 8 out of 10 benchmarks.
- Type of Resource
- Copyright and License Information
- You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format, BUT this permission is only for a period of 45 (forty-five) days from the most recent time that you verified that this technical report is still available from the University of Illinois at Urbana-Champaign Computer Science Department under terms that include this permission. All other rights are reserved by the author(s).
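As a rough illustration of the delay set analysis mentioned in the abstract, the sketch below marks a pair of accesses in one thread as a "delay" when reordering them could close a cycle through conflicting accesses in other threads. It is a minimal, conservative sketch and not the report's algorithm: the `Access` record, the `conflicts` rule, and the forward-only treatment of program order in other threads are all assumptions made for this example, and it ignores synchronization entirely.

```python
from collections import deque
from typing import NamedTuple


class Access(NamedTuple):
    """One shared memory access (hypothetical representation)."""
    thread: int     # id of the thread that issues the access
    pos: int        # position in that thread's program order
    var: str        # name of the shared variable
    is_write: bool  # True for a write, False for a read


def conflicts(a: Access, b: Access) -> bool:
    # Assumed conflict rule: different threads, same variable, at least one write.
    return a.thread != b.thread and a.var == b.var and (a.is_write or b.is_write)


def _closes_cycle(u: Access, v: Access, accesses: list) -> bool:
    """Search other threads for a path that leads from v back to u."""
    others = [a for a in accesses if a.thread != u.thread]
    frontier = deque(a for a in others if conflicts(v, a))
    seen = set(frontier)
    while frontier:
        x = frontier.popleft()
        if conflicts(x, u):
            return True  # the path returns to u, so u -> v lies on a cycle
        for y in others:
            if y in seen:
                continue
            later_in_same_thread = y.thread == x.thread and y.pos > x.pos
            if later_in_same_thread or conflicts(x, y):
                seen.add(y)
                frontier.append(y)
    return False


def delay_pairs(accesses: list) -> list:
    """Return program-ordered pairs (u, v) whose order must be preserved."""
    return [
        (u, v)
        for u in accesses
        for v in accesses
        if u.thread == v.thread and u.pos < v.pos and _closes_cycle(u, v, accesses)
    ]


# Dekker-style example: thread 0 writes x then reads y,
# thread 1 writes y then reads x.
dekker = [
    Access(0, 0, "x", True), Access(0, 1, "y", False),
    Access(1, 0, "y", True), Access(1, 1, "x", False),
]
print(len(delay_pairs(dekker)))  # 2: both threads must keep their access order
```

The nested loops and the breadth-first search keep the running time polynomial in the number of accesses, which is the property the abstract emphasizes. A more precise analysis would also discard pairs whose order is already enforced by synchronization.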