Files in this item

9702479.pdf (application/pdf, 5MB), restricted to U of Illinois


Title: Hardware and compiler support for cache coherence in large-scale shared-memory multiprocessors
Author(s): Choi, Lynn
Doctoral Committee Chair(s): Padua, David A.
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Engineering, Electronics and Electrical; Engineering, System Science; Computer Science
Abstract: Reducing memory latency is critical to the performance of large-scale parallel systems. Due to the temporal and spatial locality of memory reference patterns, private caches can eliminate redundant memory accesses and thereby reduce both average memory latency and network traffic. However, maintaining cache coherence for such systems is still a challenge. Hardware directories can be very effective, but are too expensive for large-scale multiprocessors.
As an alternative, compiler-directed techniques (4, 5, 6, 7, 8, 9, 10, 11, 14) can be used to maintain coherence. In this approach, cache coherence is maintained locally without directory hardware, thus avoiding the complexity and overhead associated with hardware directories. Although the performance of such schemes has been demonstrated through simulations, most of the studies either assume perfect compile-time analysis or rely on analytical models without real compiler implementations (1, 3, 9, 10, 12, 13). It is still unknown how effectively the compiler can detect potentially stale references and what kind of performance can be obtained using a real compiler. Also, most of the compiler-directed coherence schemes proposed to date have not addressed the real cost of the required hardware support. For example, many of the schemes require expensive hardware support and assume a cache organization with single-word cache lines.
This dissertation addresses these hardware and compiler implementation issues and investigates the feasibility and performance of the compiler-directed cache coherence approach. We propose a new compiler-directed scheme that can be implemented on a large-scale multiprocessor using off-the-shelf microprocessors. The scheme can be adapted to various cache organizations, including multi-word cache lines and byte-addressable architectures. Several system-related issues, including critical sections, inter-thread communication, and task migration, have also been addressed. The cost of the required hardware support is minimal and proportional to the cache size. The necessary compiler algorithms, including intra- and interprocedural array data flow analysis, have been developed and implemented in the Polaris parallelizing compiler, and experimental results on the Perfect Club benchmarks (2) are discussed.
Issue Date: 1996
Rights Information: Copyright 1996 Choi, Lynn
Date Available in IDEALS: 2011-05-07
Identifier in Online Catalog: AAI9702479
OCLC Identifier: (UMI)AAI9702479
