|Title:||Design and Analysis of Fault-Tolerant Processor Arrays for Numerical Applications|
|Department / Program:||Computer Science|
|Degree Granting Institution:||University of Illinois at Urbana-Champaign|
|Abstract:||The availability of fast devices in low cost and high density technologies promises a major breakthrough in future supercomputer designs, especially in the design of highly concurrent processors. Locally interconnected processor arrays, such as systolic arrays, are well suited to efficiently implement a major class of numerical algorithms due to their massive parallelism and regular structure. However, the successful operation of a processor array depends very much on the correctness of all the processing elements in the array. Any single faulty processor may easily jeopardize the results of the whole system. This makes fault tolerance a very important issue in the design of processor arrays, and it is the subject of this thesis.
In the first part of this thesis, a fault tolerance scheme using an encoding based on the linear property is proposed; this can be applied to a class of processor arrays where each processor in the regular part of the array is a linear system. Many algorithms in digital signal processing and matrix operations are shown to map onto systems that belong to this class.
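As a rough illustration of what an encoding based on the linear property can look like, the sketch below runs several input streams through the same linear system (a small FIR filter) together with one extra checksum stream, then checks that the sum of the outputs equals the output of the checksum stream. This is not the scheme from the thesis: the function names `fir` and `check_linear`, the choice of an FIR filter, and the fault-injection tuple are all assumptions made for the example.

```python
def fir(coeffs, signal):
    """A linear system: y[n] = sum over k of coeffs[k] * signal[n - k]."""
    out = []
    for n in range(len(signal)):
        acc = 0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * signal[n - k]
        out.append(acc)
    return out


def check_linear(coeffs, streams, fault=None):
    """Return True if the outputs are consistent with the checksum stream.

    streams: equal-length integer input streams.
    fault:   optional (stream_index, sample_index, delta) tuple that corrupts
             one output sample, standing in for a faulty processing element
             (hypothetical, for illustration only).
    """
    # Encode: the checksum input is the element-wise sum of all inputs.
    checksum_in = [sum(col) for col in zip(*streams)]

    outputs = [fir(coeffs, s) for s in streams]
    if fault is not None:
        i, n, delta = fault
        outputs[i][n] += delta

    # By linearity, filtering the summed input must equal summing the outputs.
    checksum_out = fir(coeffs, checksum_in)
    summed_out = [sum(col) for col in zip(*outputs)]
    return summed_out == checksum_out


coeffs = [1, 2, 3]
streams = [[1, 2, 3, 4], [5, 6, 7, 8]]
print(check_linear(coeffs, streams))                   # fault-free run
print(check_linear(coeffs, streams, fault=(0, 2, 1)))  # one corrupted sample
```

Integer data keeps the comparison exact; with floating-point data the equality test would need a tolerance.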
The computations of eigenvalues and singular values are key to many applications including signal and image processing. In the second part of this thesis, fault tolerance schemes are proposed for the computation of eigenvalues and singular values on several recently proposed high-performance processor arrays. Special properties of each algorithm are used to perform concurrent error detection. Since, in most cases, data encoding is not necessary, the introduced overhead is extremely low in terms of both hardware and time redundancy.
In the next section, a novel concurrent error detection technique using residue codes is proposed, which can be applied to the processor arrays derived from signal flow graphs.
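For readers unfamiliar with residue codes, the following sketch shows the general idea on a single multiply-accumulate step, the kind of operation a node in a signal flow graph typically performs. It is a generic textbook-style check, not the technique proposed in the thesis; the modulus 3 and the names `MODULUS`, `mac_with_residue`, and `inject_error` are assumptions chosen for the example.

```python
MODULUS = 3  # assumed check modulus; small moduli keep the checker cheap


def mac_with_residue(a, b, c, inject_error=0):
    """Compute a*b + c and verify it against an independent residue channel.

    inject_error is a hypothetical additive fault on the main result.
    Returns (result, ok), where ok is False if the residues disagree.
    """
    # Main datapath (possibly faulty).
    result = a * b + c + inject_error

    # Residue datapath: the same operation carried out modulo MODULUS.
    predicted = ((a % MODULUS) * (b % MODULUS) + (c % MODULUS)) % MODULUS

    return result, (result % MODULUS) == predicted


print(mac_with_residue(7, 5, 4))                  # fault-free
print(mac_with_residue(7, 5, 4, inject_error=1))  # detected
print(mac_with_residue(7, 5, 4, inject_error=3))  # multiple of 3: escapes detection
```

The last call shows the known limitation of such a check: an error that is a multiple of the modulus leaves the residue unchanged and goes undetected.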
After detecting an error, fault location is performed either through some special algorithms or through the use of time redundancy. Then, the reconfiguration process isolates the faulty unit and allows the system to resume its normal operations. In the final part, efficient reconfiguration techniques are proposed, which can be applied to a wide range of architectures, including binary trees, rectangular arrays, and shuffle-exchange processor arrays.
Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 1987.
|Date Available in IDEALS:||2014-12-15|