Files in this item

File: SINGH-THESIS-2021.pdf (899kB)
Description: (no description provided)
Format: PDF (application/pdf)

Description

Title: Toward predictable execution of real-time workloads on modern GPUs
Author(s): Singh, Jayati
Advisor(s): Caccamo, Marco
Department / Program: Electrical & Computer Eng
Discipline: Electrical & Computer Engr
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: M.S.
Genre: Thesis
Subject(s): GPU; real-time; spatial partitioning; warp; scheduling; predictable execution
Abstract: Over the last decade, real-time systems have witnessed a major increase in computational demands, which cannot be met by existing multi-core processors. Graphics processing units (GPUs) are a cost-effective solution to serve such systems. The high throughput and energy efficiency offered by GPUs have led to their widespread adoption. Most real-time systems today have multiple tasks utilizing the GPU, and GPUs are getting bigger (more processing units) with every generation. Hence, prior solutions that give each task exclusive access to the GPU are no longer feasible from either a real-time or a cost perspective. This necessitates predictable GPU multi-tasking, which unfortunately cannot be trivially achieved in modern GPUs. New spatial and temporal scheduling policies need to be explored and enforced in modern GPUs to enable predictable execution of GPU tasks. Therefore, this thesis investigates two approaches to achieve predictable execution on NVIDIA GPUs. The first approach involves executing different tasks on disjoint sets of GPU processing units, that is, spatial partitioning (SP). There has been considerable effort by the industry and research community to enable GPU SP. However, leveraging SP to improve schedulability still needs to be investigated thoroughly. Therefore, we propose heuristics to partition the GPU into sets of processing units and assign tasks to each partition, with the goal of increasing utilization while respecting the tasks' timing constraints. The second approach to enforce multi-tasking on GPUs is simultaneous multi-kernel (SMK). SMK arbitrates between tasks at the lowest level of execution, namely, at the warp level. We propose a real-time, priority-aware warp scheduler and study its performance when compared against kernel-agnostic policies like loose-round-robin and greedy-then-oldest, which are implemented in NVIDIA hardware today. We implement and evaluate our proposed warp scheduling policy on GPGPU-Sim.
Issue Date: 2021-04-27
Type: Thesis
URI: http://hdl.handle.net/2142/110571
Rights Information: Copyright 2021 Jayati Singh
Date Available in IDEALS: 2021-09-17
Date Deposited: 2021-05

