Files in this item:

8823067.pdf (8 MB, PDF) -- Restricted to U of Illinois

Title: Reducing Interprocessor Communication in Parallel Architectures: System Configuration and Task Assignment
Author(s): Abraham, Santosh George
Doctoral Committee Chair(s): Davidson, Edward S.
Department / Program: Electrical Engineering
Discipline: Electrical Engineering
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Engineering, Electronics and Electrical
Abstract: Most future supercomputers will be based on parallel processing architectures. Currently, many applications with sufficient parallelism cannot run efficiently on large parallel processing systems because interprocessor communication and synchronization significantly degrade performance. Interprocessor communication is incurred on message-passing systems when two processors exchange data through communication links, and on shared-memory systems when variables stored in shared memory are written by one processor and then read by another.
Two aspects of the communication bottleneck in multiprocessors, namely delay and bandwidth, are identified. The tradeoff between the cost of the processors and the cost of the communication structure is analyzed, and expressions are derived for the optimum number of processors that minimizes the total cost of achieving a desired performance level. For most combinations of applications and parallel architectures, maximum speedups are limited by communication delays.
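The thesis's actual expressions are not reproduced in this record; as an illustrative sketch of the cost-performance tradeoff it describes, suppose compute time with p processors is W/p, communication volume grows linearly with p, and communication bandwidth must be purchased so that communication fits in the slack left under a target completion time. All names and constants below are hypothetical:

```python
def optimal_processor_count(W, T_target, c_proc, c_bw, comm_per_proc, p_max=200):
    """Sweep processor counts and return the one that meets the target
    completion time at minimum total cost (processors + bandwidth).

    Hypothetical model: compute time is W/p; communication volume grows
    linearly with p; bandwidth must be bought so communication fits in
    the slack T_target - W/p.
    """
    best = None
    for p in range(1, p_max + 1):
        slack = T_target - W / p        # time left over for communication
        if slack <= 0:
            continue                    # too few processors to meet the target
        bw = comm_per_proc * p / slack  # bandwidth needed to fit the slack
        cost = p * c_proc + bw * c_bw
        if best is None or cost < best[1]:
            best = (p, cost)
    return best  # (processor count, total cost)
```

With these made-up constants, adding processors past the optimum keeps buying compute that the growing communication cost no longer pays back, so total cost is minimized at an interior point.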
Communication delay models are developed for a hierarchical multiprocessor in which all processors share a global memory and are grouped into disjoint clusters, each with a cluster memory shared by the processors in that cluster. Three classes of algorithms are analyzed, and techniques are derived for determining the optimum cluster size, i.e., the number of processors per cluster, that minimizes average communication delay and hence completion time. The effectiveness of hierarchical multiprocessors in emulating common interconnection schemes, viz., ring, mesh, and hypercube, is also analyzed.
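One way such an interior optimum arises (a hedged sketch with hypothetical parameters a, b, t_0, t_g, not the thesis's model): if a fraction 1 - a/k of references hit the cluster memory at cluster size k, cluster access time grows with contention as t_0 + bk, and global-memory accesses cost t_g, the average access time and its minimizer are

```latex
t(k) = \left(1 - \frac{a}{k}\right)(t_0 + b k) + \frac{a}{k}\, t_g
     = t_0 - ab + b k + \frac{a\,(t_g - t_0)}{k},
\qquad
\frac{dt}{dk} = b - \frac{a\,(t_g - t_0)}{k^2} = 0
\;\Longrightarrow\;
k^{*} = \sqrt{\frac{a\,(t_g - t_0)}{b}}.
```

Small clusters miss too often into slow global memory; large clusters suffer cluster-memory contention, so the optimum balances the two terms.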
The solution of sparse systems of equations on the Alliant FX/8 multiprocessor is used to explore computation-communication tradeoffs. Blocking improves communication time but reduces parallelism. Optimal block sizes for various blocking schemes are determined both on the Alliant FX/8 and on an emulation of a multiprocessor with slower global memory. The execution of a linear system solver, Block Solve, on a hierarchical multiprocessor is simulated and optimum cluster sizes that minimize completion times are determined.
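The blocking tradeoff above can be sketched numerically (a hypothetical model, not the thesis's measured data): larger blocks amortize per-word communication overhead, but fewer, larger block tasks cap the usable parallelism.

```python
def optimal_block_size(n, p, W, c):
    """Sweep power-of-two block sizes B for an n x n problem on p
    processors. Hypothetical model: n/B independent block tasks cap
    usable parallelism at min(p, n/B), while per-word communication
    overhead amortizes over larger blocks (comm time ~ c * n^2 / B)."""
    best = None
    B = 1
    while B <= n:
        m = n // B                 # number of independent block tasks
        compute = W / min(p, m)    # larger B -> fewer tasks -> less parallelism
        comm = c * n * n / B       # larger B -> cheaper communication
        total = compute + comm
        if best is None or total < best[1]:
            best = (B, total)
        B *= 2
    return best  # (block size, completion time)
```

In this toy model the best block size is the largest one that still keeps all p processors busy, mirroring the abstract's observation that blocking improves communication time but reduces parallelism.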
Algorithms that use flow-graph techniques to assign tasks to processors so as to minimize the total communication cost are described, including an algorithm that obtains a complete assignment with worst-case cost less than twice the optimum.
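The thesis's multiprocessor approximation algorithm is not reproduced here; as a sketch of the flow-graph idea in the two-processor case (after Stone's classic formulation), tasks pinned to each processor act as source and sink, communication weights become edge capacities, and a minimum s-t cut gives the assignment of free tasks that minimizes total cross-processor communication. Task names and weights below are hypothetical.

```python
from collections import defaultdict, deque

def min_cut_assignment(tasks, pinned, comm):
    """Assign tasks to two processors (0 and 1) so the total weight of
    communication edges crossing the split is minimized, given that some
    tasks are pinned. Solved as an s-t min cut via Edmonds-Karp max flow."""
    S, T = "__P0__", "__P1__"
    INF = float("inf")
    cap = defaultdict(lambda: defaultdict(int))  # residual capacities
    for t, proc in pinned.items():               # pinned tasks tie to s or t
        if proc == 0:
            cap[S][t] = INF
        else:
            cap[t][T] = INF
    for (u, v), w in comm.items():               # undirected edge -> both ways
        cap[u][v] += w
        cap[v][u] += w

    def bfs():                                   # shortest augmenting path
        parent = {S: None}
        q = deque([S])
        while q:
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    if v == T:
                        return parent
                    q.append(v)
        return None

    total = 0
    while True:
        parent = bfs()
        if parent is None:
            break
        f, v = INF, T                            # bottleneck along the path
        while parent[v] is not None:
            f = min(f, cap[parent[v]][v])
            v = parent[v]
        v = T
        while parent[v] is not None:             # push flow, update residuals
            u = parent[v]
            cap[u][v] -= f
            cap[v][u] += f
            v = u
        total += f

    seen = {S}                                   # source side of the min cut
    q = deque([S])
    while q:
        u = q.popleft()
        for v, c in cap[u].items():
            if c > 0 and v not in seen:
                seen.add(v)
                q.append(v)
    return {t: (0 if t in seen else 1) for t in tasks}, total
```

For example, with t1 pinned to processor 0, t4 pinned to processor 1, and communication weights {(t1,t2): 4, (t2,t3): 1, (t3,t4): 4, (t1,t3): 1, (t2,t4): 1}, the minimum cut places t2 with t1 and t3 with t4 at total communication cost 3. This solves the two-processor case exactly; extending the idea to many processors is where approximation bounds such as the factor-of-two result enter.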
Issue Date: 1988
Description: 227 p.
Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 1988.
Other Identifier(s): (UMI)AAI8823067
Date Available in IDEALS: 2014-12-15
Date Deposited: 1988
