|Title:||Tunable shared-memory abstractions for distributed-memory systems|
|Author(s):||Totty, Brian Keith|
|Doctoral Committee Chair(s):||Reed, Daniel|
|Department / Program:||Computer Science|
|Degree Granting Institution:||University of Illinois at Urbana-Champaign|
|Abstract:||Distributed memory multiprocessor architectures offer enormous computational power, by exploiting the concurrent execution of many loosely connected processors. Yet, such scalability is not without price. Interface delays and low interconnection bandwidth to the distributed memories make internode memory access inefficient. Furthermore, as processor speeds increase, the performance gap among the layers of the local memory hierarchy increases. To achieve good performance, the programmer and system must carefully manage data to increase intranode and internode memory locality. While attempting to do so, data management must not become such a burden that it prevents the creation of sophisticated programs.
This thesis investigates tunable, data management abstractions called "distributed data structures." Distributed data structures provide a compromise between flexible, yet burdensome, message-passing systems, and inflexible, yet simple, shared-memory systems. Distributed data structures support the illusion of shared-memory, allowing access to constituent data structure elements on demand, independent of physical location. However, the implementation of distributed data structures permits the tuning of data management policies to algorithmic and systemic conditions, while hiding the machinations of data motion from the programmer. The implementation also encourages "mixed-mode" programming, combining direct memory access with message passing, as performance and programmatic clarity warrant. This approach has the advantages of performance, flexibility, abstraction, and system independence.
We evaluate the costs and benefits of these tunable, data management abstractions by mathematical modelling, discrete-event simulation, and a prototype implementation for three generations of Intel message passing multiprocessors. The prototype system, christened Poli-C, provides a tunable, shared-memory system over a native message passing subsystem. Poli-C permits the assignment of specific data layouts, page sizes, and coherence protocols to individual data structures. This tunability is achieved via software generalization of hardware shared memory schemes. While such emulation incurs overhead, the concomitant gains in hit ratio and protocol efficiency from properly tuned data management often outweigh the costs.
The conclusions of this study indicate the utility of distributed data structures. The tunability provided by distributed data structures has significant impact on resulting performance. This tunability can be provided, without the need of aggressive hardware or compilation support, with a small software overhead. The resulting system enhances portability, programmability, and performance.
|Rights Information:||Copyright 1994 Totty, Brian Keith|
|Date Available in IDEALS:||2011-05-07|
|Identifier in Online Catalog:||AAI9512573|