Files in this item

DURRANI-THESIS-2020.pdf (application/pdf, 1 MB), Restricted Access

Title: Utilizing GPU tensor cores for algorithmic acceleration
Author(s): Durrani, Sultan Hayat Khan
Advisor(s): Hwu, Wen-Mei W.
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Computer Architecture; Tensor Cores
Abstract: There has been a surge in demand for domain-specific architectures driven by wide-ranging deep learning applications such as image classification, speech recognition, healthcare, and self-driving cars. Matrix multiplication acceleration has been a popular design choice when creating these specialized units to boost deep learning training and inference. NVIDIA's Volta architecture introduced Tensor Cores, which promised a 3x speedup over the Pascal architecture. Despite the favorable performance gains, these accelerators have not been applied extensively to a wider class of algorithms. In this thesis we introduce novel ways of mapping various algorithms onto Tensor Cores. We implemented Tensor Core based reduction, power iteration, and the Fast Fourier Transform (FFT), and show that effectively utilizing these GPU compute resources yields substantial performance gains. Our reduction achieved a 1.5x speedup over the CUB API; power iteration achieved on average a 2x speedup over Thrust- and cuBLAS-based implementations; and our FFT implementation outperformed cuFFT by up to 8x.
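The abstract does not spell out how a reduction maps onto a matrix-multiply unit, but the underlying idea can be sketched: the sum of n values equals the product of a 1-by-n row of ones with the data arranged as an n-by-1 column, which is exactly the multiply-accumulate shape a Tensor Core fragment computes in hardware. The sketch below is illustrative only and is not taken from the thesis; the function names and the plain-Python matmul stand in for the fragment-sized WMMA operation a real implementation would issue.

```python
def matmul(a, b):
    """Naive matrix multiply: a is m x k, b is k x n (lists of lists)."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

def reduce_as_matmul(xs):
    """Sum a vector by phrasing it as (1 x n ones) @ (n x 1 data).

    On a GPU this multiply-accumulate is the shape a Tensor Core
    fragment evaluates natively, so the reduction can ride the MMA
    hardware instead of a tree of scalar additions.
    """
    ones_row = [[1.0] * len(xs)]   # 1 x n row of ones
    data_col = [[x] for x in xs]   # n x 1 column of data
    return matmul(ones_row, data_col)[0][0]
```

For example, `reduce_as_matmul([1, 2, 3, 4])` returns `10.0`; a Tensor Core implementation would tile the input into fragment-sized blocks and accumulate partial sums the same way.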
Issue Date: 2020-05-13
Rights Information: Copyright 2020 Sultan Hayat Khan Durrani
Date Available in IDEALS: 2020-08-27
Date Deposited: 2020-05
