Files in this item

File:Lifetime Reliability Aware Microprocessors.pdf (1MB)
Description:(no description provided)
Format:PDF (application/pdf)

Description

Title:Lifetime Reliability Aware Microprocessors
Author(s):Srinivasan, Jayanth
Subject(s):microprocessor
computer science
Abstract:Ensuring long-term, or "lifetime" reliability, as dictated by the hard error rate due to wear-out based failures, is a critical requirement for microprocessor manufacturers. At the same time, the steady increases in CMOS processor performance have been driven by aggressive device scaling. This continuous scaling, coupled with increasing on-chip temperatures, is making lifetime reliability targets increasingly difficult to meet. This dissertation addresses lifetime reliability issues from a microarchitectural perspective. Our key contributions include (i) the first architecture-level methodology for evaluating lifetime reliability as a function of application behavior, (ii) a quantification of the impact of device scaling on lifetime reliability, taking workload characteristics into consideration, and (iii) performance- and cost-effective architectural solutions targeted at enhancing lifetime reliability. The first part of this dissertation focuses on the design of tools and models to evaluate processor lifetime reliability. Using industrial-strength models for lifetime reliability modes, we develop a methodology, called RAMP, to estimate lifetime reliability from an architectural and application perspective. We propose two implementations of RAMP, RAMP 1.0 and RAMP 2.0, which differ in their utility and accuracy. This dissertation also extends the RAMP methodology by adding scaling models for different technology generations to its failure mechanisms. Our quantification of the impact of scaling on a contemporary superscalar processor shows that device scaling has a significant detrimental impact on processor hard failure rates. The second part of this dissertation examines a range of microarchitectural techniques for lifetime reliability enhancement. In contrast to previous application-oblivious methods, these techniques allow processor designers to trade off cost, performance, and reliability in an application-aware fashion.
First, we propose dynamic reliability management (DRM), where the processor uses adaptive hardware to dynamically respond to changing application behavior to maintain its lifetime reliability target. Our results show that DRM enables the processor to extract significant performance benefit for a spectrum of reliability design costs. Next, we study two techniques that leverage microarchitectural structural redundancy for lifetime reliability enhancement. Structural redundancy has the potential to be more cost- and performance-effective than traditional processor redundancy. In structural duplication, redundant microarchitectural structures are added to the processor and designated as spares. Spare structures can be turned on when the original structure fails, increasing the processor's lifetime. Graceful processor degradation is a technique that exploits existing microarchitectural redundancy for reliability. Redundant structures that fail are shut down while the processor still maintains functionality, thereby increasing the processor's lifetime, but at lower performance. Our evaluation shows significant reliability benefit from these techniques for a range of cost and performance budgets. Overall, this dissertation lays the basic foundation for microarchitectural analysis of lifetime reliability and provides new tools and techniques to handle this critical emerging technology challenge.
Issue Date:2006-05
Genre:Technical Report
Type:Text
URI:http://hdl.handle.net/2142/11176
Other Identifier(s):UIUCDCS-R-2006-2614
Rights Information:You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format, BUT this permission is only for a period of 45 (forty-five) days from the most recent time that you verified that this technical report is still available from the University of Illinois at Urbana-Champaign Computer Science Department under terms that include this permission. All other rights are reserved by the author(s).
Date Available in IDEALS:2009-04-20

