Files in this item

ECE499-Sp2019-jeong.pdf (application/pdf, 892 kB), no description provided. Restricted to U of Illinois.
Title: A hybrid compressed deep neural network implementation on FPGA to balance accuracy and latency
Author(s): Jeong, Paul
Contributor(s): Chen, Deming
Subject(s): machine learning
hardware acceleration
convolutional neural network
high-level synthesis
network compression
Abstract: Compression techniques for deep neural networks (DNNs) have been widely investigated to reduce model size so that DNNs can be implemented on hardware with strict resource restrictions. However, a major disadvantage of model compression is accuracy degradation, which can easily lead to dissatisfaction in real-life applications. To address this problem, we propose a new compressed-network inference scheme that combines a high-accuracy DNN with a low-resource DNN, adapting to different scenarios and balancing DNN inference accuracy against total resource usage. The proposed design delivers overall accuracy close to that of the high-accuracy model while using limited DSP resources. We demonstrate our design on an image classification task with AlexNet-like backbone networks as a case study. The results show that our design increases throughput by 1.7x with only 4.7% additional DSPs, and that our inference mechanism recovers more than 75% of the accuracy drop caused by extreme network compression.
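The abstract describes combining a compressed, low-resource DNN with a high-accuracy model in one inference scheme. The thesis text is not available here, so the exact gating mechanism is unknown; one common way such hybrid schemes are realized is a confidence-gated cascade, sketched below as an illustrative assumption. The model callables, the softmax gating, and the `threshold` value are all hypothetical, not taken from the thesis.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1-D logit vector.
    e = np.exp(logits - logits.max())
    return e / e.sum()

def cascade_predict(x, small_model, large_model, threshold=0.9):
    """Run the compressed (low-resource) model first; fall back to the
    high-accuracy model only when the compressed model's confidence is
    below `threshold`. This gating rule is an assumption for
    illustration, not the mechanism specified in the thesis."""
    probs = softmax(small_model(x))
    if probs.max() >= threshold:
        return int(probs.argmax()), "small"
    probs = softmax(large_model(x))
    return int(probs.argmax()), "large"

# Toy stand-ins for the two networks (hypothetical logit outputs).
confident_small = lambda x: np.array([4.0, 0.1, 0.1])   # peaked -> handled by small model
unsure_small    = lambda x: np.array([0.5, 0.4, 0.3])   # flat -> escalated to large model
large           = lambda x: np.array([0.2, 3.0, 0.1])

x = np.zeros(4)
label, used = cascade_predict(x, confident_small, large)   # resolved by the small model
label2, used2 = cascade_predict(x, unsure_small, large)    # escalated to the large model
```

Under a gate like this, most inputs are served by the compressed model, which is consistent with the abstract's claim of higher throughput at a small additional DSP cost, with the large model recovering accuracy on the hard cases.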
Issue Date: 2019-05
Date Available in IDEALS: 2019-06-19
