|Title:||Interconnection networks and data prefetching for large-scale multiprocessors: Design and performance|
|Doctoral Committee Chair(s):||Veidenbaum, Alexander V.|
|Department / Program:||Computer Science|
|Degree Granting Institution:||University of Illinois at Urbana-Champaign|
|Abstract:||Increasing computing power demands higher memory performance than ever before, and memory access becomes a more serious bottleneck in high-performance computer systems. Therefore, reducing memory access latency or hiding the latency is crucial to achieving high performance. In shared memory multiprocessor systems, where memory access must traverse interconnection networks, the system performance and costs are directly affected by interconnection networks. Latency hiding techniques such as data prefetching affect interconnection network and memory performance by demanding more bandwidth. This dissertation studies interconnection networks and data prefetching.
We establish a theoretical framework for routing in shuffle-exchange networks and develop an algorithm for shortest path routing in single stage shuffle-exchange networks. Single stage shuffle-exchange networks are attractive because of their relatively low cost and shorter average internode distance as compared to multistage shuffle-exchange networks. The theoretical framework and routing algorithm allow network size to be any multiple of the switch size. We evaluate the effect of shortest path routing on memory system performance.
Three different network topologies are evaluated by varying system size, switch size and channel width under various design constraints. These networks are multistage and single stage shuffle-exchange networks and multidimensional torus networks. By employing detailed trace-driven simulations of an entire system, we conduct an extensive comparative study of these interconnection networks in terms of performance and cost. In general, multistage shuffle-exchange networks are the best network topology when cost is not the main limiting factor. When cost is a main limiting factor, single stage shuffle-exchange networks are the best network topology. Multidimensional torus networks are seriously limited by their long average internode distance.
We investigate the effect of data prefetching on interconnection networks and vice versa. When data prefetching is effective, the performance difference caused by different networks tends to be reduced. The high memory bandwidth demand of prefetching requires that the number of outstanding requests be controlled to effectively utilize network and memory bandwidth. Finally, we develop and evaluate a prefetching scheme that enables stride-directed prefetching at the second level of a memory hierarchy and outside a processor chip. Using the prefetching scheme, we compare three different second-level memory organizations: traditional caches, prefetching without a cache, and prefetching with a small cache. Prefetching with a small cache shows strong potential to be an efficient second-level memory organization.
|Rights Information:||Copyright 1995 Kim, Sunil|
|Date Available in IDEALS:||2011-05-07|
|Identifier in Online Catalog:||AAI9624392|
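
The abstract refers to shortest path routing in single stage shuffle-exchange networks. As a rough illustration only, the sketch below finds a fewest-hop route by breadth-first search. It assumes the textbook binary shuffle-exchange graph, in which each of the 2^n nodes has a shuffle link (rotate the n-bit address left by one) and an exchange link (flip the lowest address bit). This is not the dissertation's algorithm, which is not reproduced in this record and which also covers network sizes that are any multiple of the switch size. The function names `shuffle`, `exchange`, and `shortest_route` are hypothetical.

```python
from collections import deque


def shuffle(node: int, n_bits: int) -> int:
    """Perfect shuffle: rotate the n-bit node address left by one bit."""
    mask = (1 << n_bits) - 1
    return ((node << 1) & mask) | (node >> (n_bits - 1))


def exchange(node: int) -> int:
    """Exchange: flip the least significant address bit."""
    return node ^ 1


def shortest_route(src: int, dst: int, n_bits: int) -> list[tuple[str, int]]:
    """Return a fewest-hop route as (operation, node reached) steps.

    Brute-force breadth-first search over the assumed binary
    shuffle-exchange graph; illustrative, not the dissertation's method.
    """
    size = 1 << n_bits
    if not (0 <= src < size and 0 <= dst < size):
        raise ValueError("src and dst must be valid n-bit node addresses")

    # parent[node] = (previous node, link used to reach node)
    parent: dict[int, tuple[int, str] | None] = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for op, nxt in (("shuffle", shuffle(node, n_bits)),
                        ("exchange", exchange(node))):
            if nxt not in parent:
                parent[nxt] = (node, op)
                queue.append(nxt)

    # Walk back from dst to src to recover the route.
    route: list[tuple[str, int]] = []
    node = dst
    while parent[node] is not None:
        prev, op = parent[node]
        route.append((op, node))
        node = prev
    route.reverse()
    return route
```

Under these assumptions, `shortest_route(0, 7, 3)` needs five hops in an 8-node network, alternating exchange and shuffle links. An exhaustive search like this scales with the number of nodes, which is one reason an address-based routing algorithm is preferable in practice.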