Files in this item



SP20-ECE499-Thesis-Liu, Hanhaotian.pdf (application/pdf, 880kB)
Restricted to U of Illinois


Title: Accelerating large sparse deep neural networks inference
Author(s): Liu, Hanhaotian
Contributor(s): Hwu, Wen-mei
Subject(s): Sparse Neural Networks
Abstract: This thesis presents several methods for accelerating inference in large, sparse deep neural networks (DNNs) on GPUs. DNNs are now widely used in fields such as computer vision and speech recognition. They tend to be more accurate as the model grows in layers and neurons, but a larger model creates two problems: it is harder to transfer and to store in limited fast memory, and it requires more computation, which slows inference. The first problem can be addressed with sparse networks, which achieve comparable accuracy with fewer weights and are therefore smaller. This thesis addresses the second problem, the inference slowdown caused by the increased computation. To that end, various ways of restructuring the computation and of parallelizing inference across multiple devices were tested on networks of different sizes, with the MNIST dataset as input. The characteristics of the networks and the intermediate results after each layer were also examined to guide optimization of the implementations. Each method improved inference performance to some degree, and together they show that this kind of network has great potential for parallelization and acceleration.
Issue Date: 2020-05
Date Available in IDEALS: 2020-06-11
