Files in this item

File: PATIL-DISSERTATION-2021.pdf (10MB)
Description: (no description provided)
Format: PDF (application/pdf)

Description

Title: Harnessing noise to enhance robustness vs. efficiency trade-off in machine learning
Author(s): Patil, Ameya D.
Director of Research: Shanbhag, Naresh R
Doctoral Committee Chair(s): Shanbhag, Naresh R
Doctoral Committee Member(s): Schwing, Alexander; Hanumolu, Pavan; Roy Choudhury, Romit
Department / Program: Electrical & Computer Eng
Discipline: Electrical & Computer Engr
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: Ph.D.
Genre: Dissertation
Subject(s): Deep Learning
Adversarial Robustness
Computational Efficiency
Energy-efficient computing
Beyond-CMOS computing
in-memory computing
resistive crossbar arrays
Abstract: While deep nets have achieved human-comparable accuracy on various classification tasks, they fall significantly short on robustness and cost metrics. For example, tiny engineered corruptions of deep net inputs can reduce their accuracy to zero. Furthermore, deep nets require millions of trainable parameters, resulting in significant training and inference costs. These robustness and cost challenges are well recognized today. In response, a plethora of works has focused on improving either the accuracy vs. robustness trade-off or the accuracy vs. cost trade-off. However, simultaneous consideration of accuracy, robustness, and cost is largely absent today, in part because far fewer works have explored the robustness vs. cost trade-off. This dissertation aims to fill this gap by focusing explicitly on the robustness vs. cost trade-off in the presence of data noise as well as hardware noise. Specifically, we explore how to harness noise in order to enhance this trade-off. We characterize and improve robustness vs. cost trade-offs across diverse problem settings, ranging from beyond-CMOS hardware implementations of machine learning (ML) classifiers to efficient training of deep nets that are robust to multiple types of corruptions in their inputs. This dissertation can be roughly divided into two parts, one focusing on hardware noise and the other on data noise.

In the first part, we start by harnessing noise in spintronic hardware implementations, where logic gates become error-prone when operated at lower switching energy/delay. We propose techniques to shape the resulting hardware noise distribution and to efficiently compensate for it at the system-level output. As a result, we observe a 1000× improvement in tolerance to gate-level switching error rates, while keeping the area/energy overhead of the compensation circuits as low as 15%. These robustness enhancements further enable a 3× reduction in the iso-throughput energy consumption of a binary ML classifier employed for EEG-based seizure detection. Building on this work, we propose spintronic channel networks that exploit the exponential decay of spin current to efficiently realize multi-bit dot product computation. We employ error-prone nanomagnets as efficient stochastic slicers biased by spin currents proportional to the likelihood of the classification decision. We achieve 112×-to-22.5× and 14×-to-2.5× higher energy efficiency over conventional spin-based and 20 nm CMOS designs, respectively, when realizing 10-to-100-dimensional binary classifiers. Furthermore, we consider the impact of hardware noise originating from process variations and readout circuits in in-memory computing implementations employing non-volatile resistive crossbar arrays. Based on our analysis, we identify design configurations achieving the highest signal-to-noise ratio (SNR), and we further estimate how such robustness trades off with the array energy consumption.

In the second part, we switch gears to improve the robustness vs. cost trade-off for deep nets in the presence of data noise. Specifically, we focus on the impact of adversarial perturbations of deep net inputs. We propose and validate hypotheses about the orientations of the dominant subspaces of adversarial perturbations, and we demonstrate how changes in the curvature of a deep net's decision boundary affect those orientations.
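To make this geometric picture concrete, here is a minimal numerical sketch of the linear case, where the smallest ℓ2 perturbation that moves an input onto the decision boundary is normal to that boundary. This is our own illustration with hypothetical dimensions, not an excerpt from the dissertation:

```python
import numpy as np

# Toy illustration: for a linear decision boundary f(x) = w.x + b, the
# minimal L2 adversarial perturbation is normal to the boundary,
# delta = -f(x) * w / ||w||^2. The dominant adversarial direction is thus
# fixed by the boundary's local geometry; for nonlinear deep nets,
# curvature changes rotate this direction.
rng = np.random.default_rng(0)
w = rng.standard_normal(10)   # hypothetical 10-dimensional classifier weights
b = 0.1
x = rng.standard_normal(10)   # a sample input

f_x = w @ x + b
delta = -f_x * w / (w @ w)                    # shortest vector onto the boundary
print(np.isclose(w @ (x + delta) + b, 0.0))   # True: x + delta lies on the boundary
```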
Based on these insights, we demonstrate how shaped noise can be introduced as a feature to enhance the robustness vs. cost trade-off in deep nets. Specifically, we propose shaped noise augmented processing (SNAP), a method to efficiently train deep nets that are robust to multiple types of adversarial perturbations simultaneously. SNAP prepends a deep net with a shaped noise augmentation layer whose distribution is learned along with the network parameters using any established robust training framework. Based on extensive comparisons with nine state-of-the-art (SOTA) robust training frameworks, we show that SNAP achieves the best robustness vs. training cost trade-off. In particular, it enables a 4× reduction in training cost compared to the most recently published SOTA approach. Furthermore, thanks to its computational simplicity, SNAP is the first technique of its kind that scales to large datasets such as ImageNet.
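As a rough illustration of how such a layer could be wired up, here is a hedged PyTorch sketch, assuming additive Gaussian noise with learnable per-channel scales; the backbone, dimensions, and distribution choice are our placeholders, not the dissertation's implementation:

```python
import torch
import torch.nn as nn

class ShapedNoiseLayer(nn.Module):
    """Sketch of a shaped noise augmentation layer in the spirit of SNAP:
    additive noise whose per-channel scale is a learnable parameter trained
    jointly with the network weights. The Gaussian choice and per-channel
    shaping are our assumptions, not the dissertation's exact design."""
    def __init__(self, channels: int):
        super().__init__()
        self.log_scale = nn.Parameter(torch.zeros(channels))  # learned noise shape

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scale = self.log_scale.exp().view(1, -1, 1, 1)  # broadcast over H, W
        return x + scale * torch.randn_like(x)          # inject shaped noise

# Prepend the layer to any backbone and train with an established robust
# training loss; the noise distribution then adapts along with the weights.
backbone = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
)
model = nn.Sequential(ShapedNoiseLayer(channels=3), backbone)
logits = model(torch.randn(8, 3, 32, 32))  # e.g., a CIFAR-sized batch
```

In this sketch the only extra work per forward pass is an elementwise noise addition, which is consistent with the abstract's emphasis on SNAP's computational simplicity and scalability.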
Issue Date: 2021-10-19
Type: Thesis
URI: http://hdl.handle.net/2142/113824
Rights Information: Copyright 2021 Ameya Patil
Date Available in IDEALS: 2022-04-29
Date Deposited: 2021-12

