Files in this item

application/pdf: John_Stratton.pdf (2MB)


Title: Performance portability of parallel kernels on shared-memory systems
Author(s): Stratton, John
Director of Research: Hwu, Wen-Mei W.
Doctoral Committee Chair(s): Hwu, Wen-Mei W.
Doctoral Committee Member(s): Chen, Deming; Lumetta, Steven S.; Padua, David A.
Department / Program: Electrical & Computer Engineering
Discipline: Electrical & Computer Engineering
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Performance Portability
Abstract: This work describes my solution to the performance portability problem, between CPUs and GPUs in particular, while laying the foundation for even broader performance portability support. I argue that the best approach is to use a language like OpenCL as a portable, low-level programming model with well-defined mechanisms for expressing multi-level parallelism and locality. That low-level program representation can be supported with architecture-specific compilers, runtimes, and libraries that target the application code to various platforms with high performance. High-level language designers or tool developers could then target this single, low-level programming and parallelism model as a portable, high-performance intermediate program representation. To demonstrate the feasibility of this approach, I show how one would design a good CPU implementation of OpenCL, given that the programs are written according to the current high-level GPU vendor optimization guidelines. Programs written in such a way already meet the criteria for good GPU performance, and in this work I show that those same programs, run on a CPU platform implemented according to my proposals, can outperform an OpenMP implementation of the same algorithm on the same system.
Issue Date: 2013-05-24
Rights Information: Copyright 2013 John Stratton
Date Available in IDEALS: 2013-05-24
Date Deposited: 2013-05