Files in this item



application/pdf: Crago_Neal.pdf (3MB)


Title:Energy-efficient latency tolerance for 1000-core data parallel processors with decoupled strands
Author(s):Crago, Neal
Director of Research:Patel, Sanjay J.
Doctoral Committee Chair(s):Patel, Sanjay J.
Doctoral Committee Member(s):Hwu, Wen-Mei W.; Lumetta, Steven S.; Chen, Deming
Department / Program:Electrical & Computer Eng
Discipline:Electrical & Computer Engr
Degree Granting Institution:University of Illinois at Urbana-Champaign
Subject(s):Parallel Processing; Graphics processing unit (GPU); General-purpose computing on graphics processing units (GPGPU); latency tolerance; decoupled architecture; compiler technique; low power; low energy
Abstract:This dissertation presents a novel decoupled latency tolerance technique for 1000-core data parallel processors. The approach focuses on tolerating instruction latency to improve single-thread performance. The main idea is to use the compiler to split the original thread into separate memory-accessing and memory-consuming instruction streams. The goal is to provide latency tolerance similar to that of high-performance techniques such as out-of-order execution while keeping hardware complexity close to that of an in-order execution core. The research in this dissertation supports the following thesis: Pipeline stalls due to long exposed instruction latency are the main performance limiter for cached 1000-core data parallel processors. By leveraging the natural decoupling of memory access and memory consumption, a serial thread of execution can be partitioned into strands that provide energy-efficient latency tolerance. This dissertation motivates the need for latency tolerance in 1000-core data parallel processors and presents decoupled core architectures as an alternative to currently used techniques. It discusses the limitations of prior decoupled architectures and proposes techniques to improve both latency tolerance and energy efficiency. Finally, the proposed decoupled architecture is compared against other approaches through an exhaustive design space exploration of energy, area, and performance using high-fidelity performance and physical design models.
Issue Date:2012-09-18
Rights Information:Copyright 2012 Neal Crago
Date Available in IDEALS:2012-09-18
Date Deposited:2012-08
