Files in this item

File: AGRAWAL-THESIS-2019.pdf (1MB), Restricted Access
Description: (no description provided)
Format: application/pdf (PDF)

Description

Title: Efficient inference of convolutional neural networks on general purpose hardware using weight repetition
Author(s): Agrawal, Rohit
Advisor(s): Fletcher, Christopher W.
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: M.S.
Genre: Thesis
Subject(s): Deep Neural Networks
Convolutional Neural Networks
Accelerator
CPU
GPU
Deep Learning Hardware
CNN Inference
Abstract: Deep Neural Networks (DNNs) have begun to permeate all corners of electronic society due to their high accuracy and machine efficiency per operation. Recent work has shown that weights within and across DNN filters exhibit large degrees of repetition, due to the pigeonhole principle and modern weight quantization schemes, and that this weight repetition can be harnessed to improve DNN inference efficiency in an accelerator/ASIC context. This thesis develops new techniques so that weight repetition yields an efficiency gain on general-purpose, programmable SIMD-based architectures such as CPUs equipped with vector extensions. We show how to write high-performance software that requires no hardware modifications and copes with the irregularity introduced by weight repetition schemes. Overall, our highly parallel software kernel achieves up to a 1.51x speedup in inference runtime over a state-of-the-art baseline.
Issue Date: 2019-04-24
Type: Text
URI: http://hdl.handle.net/2142/105251
Rights Information: Copyright 2019 Rohit Agrawal
Date Available in IDEALS: 2019-08-23
Date Deposited: 2019-05
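The core idea behind weight repetition, as described in the abstract, is that when a quantized filter contains many copies of the same weight value, the inputs multiplied by that weight can be summed first and multiplied once, trading many multiplies for adds. The following is an illustrative sketch of that factorization on a single dot product; it is not the thesis's actual SIMD kernel, and the function names are hypothetical:

```python
# Hypothetical sketch: exploiting weight repetition by grouping input
# activations that share a quantized weight, summing each group, and
# performing one multiply per unique weight instead of one per input.
from collections import defaultdict

def dot_naive(weights, inputs):
    # Baseline: one multiply and one add per input element.
    return sum(w * x for w, x in zip(weights, inputs))

def dot_factored(weights, inputs):
    # Group activations by their (repeated) weight value: one add per input.
    groups = defaultdict(float)
    for w, x in zip(weights, inputs):
        groups[w] += x
    # One multiply per unique weight value.
    return sum(w * s for w, s in groups.items())

weights = [2, 2, 2, 5, 5, 2]  # heavily repeated quantized weights
inputs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
assert dot_naive(weights, inputs) == dot_factored(weights, inputs)
```

With only 2 unique weights among 6, the factored form performs 2 multiplies instead of 6. Mapping this grouping onto vector lanes efficiently is nontrivial because the group sizes are irregular, which is the irregularity the thesis's software techniques address.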

