Files in this item

File: GARDNER-THESIS-2017.pdf (2MB)
Description: (no description provided)
Format: application/pdf
Description

Title: Approximation of CPU code using neural networks
Author(s): Gardner, Conor S
Advisor(s): Kim, Nam S
Department / Program: Electrical & Computer Engineering
Discipline: Electrical & Computer Engineering
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: M.S.
Genre: Thesis
Subject(s): Neural Network; Code Approximation; Training; Program Trace; Convolution
Abstract: There is a well-known spectrum of computing hardware ranging from central processing units (CPUs) to highly specialized application specific integrated circuits (ASICs). Most consumer CPUs are general purpose and come with mature development tools used by large communities of programmers, while ASICs can perform very specific tasks very efficiently at the expense of ease-of-use and flexibility. Other devices such as digital signal processors (DSPs), graphics processing units (GPUs), and field programmable gate arrays (FPGAs) occupy intermediate points on the usability-efficiency continuum. New development tools such as very long instruction word (VLIW) compilers, CUDA, and logic synthesis have made it easier than ever for even novice programmers to leverage the increased efficiency of DSP cores, GPUs, and FPGAs using specialized high-level programming languages for those devices. However, even after surmounting the steep learning curve, a skilled programmer will still require significantly more time to write and validate a CUDA or OpenCL function compared to an equivalent CPU function.

Neural nets are fairly general purpose tools which can perform pattern recognition or arithmetic operations on a block of input data and produce a corresponding block of output data. The aim of this project is to select a fairly arbitrary block of code, such as a C++ function, and train a neural net to mimic the original code's input-output behavior. Once the neural net has been trained, it can run on a highly parallel device such as a GPU without the programmer ever needing to write a CUDA program.

Of course, this approach also has inherent drawbacks. First, all dependent processing which consumes output data from the neural net must be able to tolerate errors, since the network can only approximate the original code. Second, since neural nets require many, often unnecessary, floating point operations, there will be a large amount of “bloat” in the neural implementation, which must be offset by the benefits of running the workload on a highly parallel device in order to be practical.
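The trace-and-train idea described in the abstract can be sketched in a few lines: record input-output pairs from the original function, then train a small model to mimic them. The sketch below uses a hypothetical stand-in function `cpu_code` and a single linear neuron trained by stochastic gradient descent; it is an illustration of the general idea only, not the thesis's actual pipeline, which targets C++ functions and GPU execution.

```python
import random

def cpu_code(x):
    # Hypothetical stand-in for the CPU function being approximated
    return 2.0 * x + 1.0

# 1. Record a program trace: sample inputs and capture the outputs.
random.seed(0)
trace = [(x, cpu_code(x))
         for x in (random.uniform(-1.0, 1.0) for _ in range(200))]

# 2. Train a single linear neuron y = w*x + b with stochastic gradient descent.
w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    for x, y in trace:
        err = (w * x + b) - y   # prediction error on this sample
        w -= lr * err * x
        b -= lr * err

# 3. The trained neuron now mimics the original code's input-output behavior.
def approx(x):
    return w * x + b
```

A real workload would replace the single neuron with a multi-layer network and run inference on the GPU; the error-tolerance caveat from the abstract applies here too, since `approx` only approximates `cpu_code`.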
Issue Date: 2017-04-26
Type: Thesis
URI: http://hdl.handle.net/2142/97481
Rights Information: Copyright 2017 Conor Gardner
Date Available in IDEALS: 2017-08-10
Date Deposited: 2017-05

