|Title:||Reducing Interprocessor Communication in Parallel Architectures: System Configuration and Task Assignment|
|Author(s):||Abraham, Santosh George|
|Doctoral Committee Chair(s):||Davidson, Edward S.|
|Department / Program:||Electrical Engineering|
|Degree Granting Institution:||University of Illinois at Urbana-Champaign|
|Subject(s):||Engineering, Electronics and Electrical|
|Abstract:||Most future supercomputers will be based on parallel processing architectures. Currently, many applications with sufficient parallelism cannot run efficiently on large parallel processing systems because interprocessor communication and synchronization significantly degrade performance. Interprocessor communication is incurred on message-passing systems when two processors exchange data over communication links, and on shared-memory systems when a variable stored in shared memory is written by one processor and then read by another.
Two aspects of the communication bottleneck in multiprocessors, namely delay and bandwidth, are identified. The tradeoff between the cost of the processors and the cost of the communication structure is analyzed, and expressions are derived for the optimum number of processors that minimizes the total cost of achieving a desired performance level. For most combinations of applications and parallel architectures, maximum speedup is limited by communication delay.
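The processor-count/communication-cost tradeoff can be sketched with a toy model (every functional form and constant below is an illustrative assumption, not one of the dissertation's derived expressions): with overhead factor 1 + c(p-1), each of p processors must be fast enough to meet a target aggregate rate; per-processor cost is taken to grow quadratically with speed, and a crossbar-like network contributes a cost proportional to p². The optimum p minimizes the total.

```python
def per_proc_speed(p, target=100.0, c=0.005):
    """Speed each processor needs so that p processors, slowed by a
    communication-overhead factor (1 + c*(p-1)), meet the target rate."""
    return target * (1 + c * (p - 1)) / p

def total_cost(p, a=1.0, l=0.5):
    """Toy total cost: processor cost a*s^2 per processor (cost grows
    quadratically with speed s), plus l*p^2 for a crossbar-like network."""
    s = per_proc_speed(p)
    return p * a * s * s + l * p * p

def optimum_processors(p_max=200):
    """Scan processor counts and return the cheapest configuration
    that meets the performance target."""
    return min(range(1, p_max + 1), key=total_cost)
```

With these particular constants the minimum falls at a moderate processor count: few processors must each be very fast (expensive), while many processors inflate both overhead and network cost.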
Communication delay models are developed for a hierarchical multiprocessor where all processors share a global memory and processors are grouped into disjoint clusters, each with a cluster memory shared by the processors in that cluster. Three different classes of algorithms are analyzed and techniques are derived to determine the optimum cluster size, i.e., the number of processors in each cluster, that minimizes average communication delays and hence completion times. The effectiveness of hierarchical multiprocessors in emulating common interconnection schemes, viz., ring, mesh, and hypercube, is analyzed.
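The cluster-sizing tradeoff can be illustrated with a toy delay model (the uniform-sharing and linear-contention assumptions, and all parameters, are illustrative and not taken from the dissertation): a fraction c/n of references stay inside a cluster of size c but pay a contention-scaled cluster-memory delay d0*c, while the rest pay the global-memory delay dg. The optimum cluster size balances the two.

```python
def avg_delay(c, n=64, d0=1.0, dg=40.0):
    """Toy average memory delay for cluster size c in an n-processor
    hierarchical machine: intra-cluster fraction c/n pays a contention-
    scaled delay d0*c; the remainder pays the global delay dg."""
    return (c / n) * (d0 * c) + (1 - c / n) * dg

def best_cluster_size(n=64, d0=1.0, dg=40.0):
    """Pick the divisor of n (so clusters are disjoint and equal-sized)
    that minimizes the average delay."""
    divisors = [c for c in range(1, n + 1) if n % c == 0]
    return min(divisors, key=lambda c: avg_delay(c, n, d0, dg))
```

Larger clusters capture more references locally but congest the cluster memory; in this toy instance the interior optimum (here c = 16 of 64 processors) beats both extremes.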
The solution of sparse systems of equations on the Alliant FX/8 multiprocessor is used to explore computation-communication tradeoffs. Blocking reduces communication time but also reduces the available parallelism. Optimal block sizes for various blocking schemes are determined both on the Alliant FX/8 and on an emulation of a multiprocessor with slower global memory. The execution of a linear system solver, Block Solve, on a hierarchical multiprocessor is simulated, and optimum cluster sizes that minimize completion times are determined.
Algorithms that use flow-graph techniques to assign tasks to processors so as to minimize total communication cost are described, including an algorithm that obtains a complete assignment whose worst-case cost is less than twice optimal.
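For the two-processor case, the classical flow-graph formulation (due to Stone) solves the assignment problem exactly: each task becomes a node, the source and sink stand for the two processors, a task's edge to the sink carries its execution cost on processor A, its edge from the source carries its cost on processor B, and inter-task edges carry the communication cost paid if the pair is split. A minimum s-t cut then equals the minimum total cost. The sketch below is this textbook construction with an Edmonds-Karp max-flow, using illustrative task names and costs; it is not code from the dissertation.

```python
from collections import defaultdict, deque

def min_cut_assignment(exec_cost, comm_cost):
    """Two-processor task assignment via min s-t cut (Stone's formulation).

    exec_cost: {task: (cost_on_A, cost_on_B)}
    comm_cost: {(t1, t2): c} -- cost paid only if t1 and t2 are split.
    Returns (tasks_on_A, tasks_on_B).
    """
    S, T = "_src", "_sink"
    cap = defaultdict(lambda: defaultdict(float))
    def add_edge(u, v, c):
        cap[u][v] += c
        cap[v][u] += 0.0            # ensure the residual entry exists
    for task, (cost_a, cost_b) in exec_cost.items():
        add_edge(S, task, cost_b)   # cut iff task assigned to B
        add_edge(task, T, cost_a)   # cut iff task assigned to A
    for (t1, t2), c in comm_cost.items():
        add_edge(t1, t2, c)         # undirected comm edge: capacity c
        add_edge(t2, t1, c)         # in both directions
    flow = defaultdict(lambda: defaultdict(float))
    while True:
        # BFS for a shortest augmenting path (Edmonds-Karp)
        parent = {S: None}
        q = deque([S])
        while q and T not in parent:
            u = q.popleft()
            for v in cap[u]:
                if v not in parent and cap[u][v] - flow[u][v] > 1e-9:
                    parent[v] = u
                    q.append(v)
        if T not in parent:
            break
        # Find the bottleneck capacity along the path, then augment.
        bottleneck, v = float("inf"), T
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, cap[u][v] - flow[u][v])
            v = u
        v = T
        while parent[v] is not None:
            u = parent[v]
            flow[u][v] += bottleneck
            flow[v][u] -= bottleneck
            v = u
    # Nodes still reachable from S in the residual graph go to processor A.
    reach, q = {S}, deque([S])
    while q:
        u = q.popleft()
        for v in cap[u]:
            if v not in reach and cap[u][v] - flow[u][v] > 1e-9:
                reach.add(v)
                q.append(v)
    on_a = {t for t in exec_cost if t in reach}
    return on_a, set(exec_cost) - on_a
```

For more than two processors the exact multiway problem is hard, which is where approximation algorithms with provable bounds, such as the factor-of-two guarantee mentioned above, come in.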
Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 1988.
|Date Available in IDEALS:||2014-12-15|
This item appears in the following Collection(s)
Dissertations and Theses - Electrical and Computer Engineering
Graduate Dissertations and Theses at Illinois