Sampling, non-linearities and scales: rethinking convolutional neural networks from a signal processing perspective

Chaman, Anadi

Permalink

https://hdl.handle.net/2142/125624

Description

Title

Sampling, non-linearities and scales: rethinking convolutional neural networks from a signal processing perspective

Author(s)

Chaman, Anadi

Issue Date

2024-07-12

Director of Research (if dissertation) or Advisor (if thesis)

Dokmanić, Ivan

Doctoral Committee Chair(s)

Dokmanić, Ivan

Committee Member(s)

Bresler, Yoram
Weiss, Yair
Zhao, Zhizhen

Department of Study

Electrical & Computer Eng

Discipline

Electrical & Computer Engr

Degree Granting Institution

University of Illinois at Urbana-Champaign

Degree Name

Ph.D.

Degree Level

Dissertation

Keyword(s)

convolutional neural networks
shift invariance
mesh-free super-resolution
mesh-free generative models
reliable deep learning
convolutional sparse coding
multi-scale models

Abstract

A large body of work from signal processing and optimization has traditionally been used to address imaging inverse problems, including approaches such as multi-scale signal analysis, compressed sensing, and sparse representations. While these traditional methods are mathematically tractable and provide reliable solutions, their performance has been limited by their inability to model complex distributions beyond hand-crafted priors. Deep convolutional neural networks (CNNs), on the other hand, have been immensely successful in the last decade and have provided state-of-the-art (SOTA) performance on various challenging tasks, thanks to their ability to learn sophisticated priors directly from data. While these advancements are substantial, cracks in their performance have been revealed in recent years, often arising when the underlying architectures are designed without heeding principles from signal processing and inverse problems. For example, CNN classifiers can be extremely brittle to a shift of merely a single pixel in the input image, an effect resulting from downsampling by pooling layers. Additionally, when black-box CNNs are used to solve inverse problems, ignoring the geometry of the underlying task can result in poor generalization beyond the training distribution. In this thesis, we aim to bridge the gap between traditional methods and deep convolutional approaches for inverse problems, thereby combining the benefits of high reliability with the excellent performance of deep learning on challenging tasks. Taking inspiration from signal processing, we design new and conceptually simple architectural modifications to existing convolutional neural networks that enable provable robustness gains, improved out-of-distribution generalization, and architectural simplicity and interpretability.
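The shift brittleness described above can be reproduced in one dimension. The sketch below is illustrative only and is not taken from the thesis: it compares plain stride-2 downsampling with an adaptive variant that keeps the polyphase component of largest energy. The selection rule, the function names, and the use of circular shifts are all assumptions made for this example.

```python
def polyphase(x, stride=2):
    """Split x into its `stride` polyphase components x[k::stride]."""
    return [x[k::stride] for k in range(stride)]


def naive_downsample(x, stride=2):
    """Fixed sampling grid: always keep indices 0, stride, 2*stride, ..."""
    return x[::stride]


def adaptive_downsample(x, stride=2):
    """Keep the polyphase component with the largest energy.

    Assumed selection rule for illustration; ties go to the first component.
    """
    return max(polyphase(x, stride), key=lambda c: sum(v * v for v in c))


def circular_shift(x, n=1):
    """Shift x to the right by n samples with wrap-around."""
    n %= len(x)
    return x[-n:] + x[:-n]


def same_up_to_circular_shift(a, b):
    """True if a equals some circular shift of b."""
    return len(a) == len(b) and any(
        a == circular_shift(b, n) for n in range(len(b))
    )


x = [0.0, 3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0]
x_shifted = circular_shift(x, 1)

# A one-sample shift moves the fixed grid onto entirely different samples.
print("naive:   ", naive_downsample(x), naive_downsample(x_shifted))

# The adaptive grid follows the signal, so the same samples are kept.
print("adaptive:", adaptive_downsample(x), adaptive_downsample(x_shifted))
```

With a fixed grid, a one-sample shift changes which samples survive, so everything downstream sees a different signal. Choosing the grid from the signal itself keeps the output unchanged up to a circular shift, provided the energies of the components are not tied.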
For example, we propose new adaptive sampling (pooling) layers that provably restore perfect robustness to shifts in convolutional neural networks. With our solution, the resulting networks exhibit perfect consistency to shifts even before training, making it the first approach that makes CNNs truly shift-invariant. We also propose a convolutional, mesh-free, and continuous super-resolution architecture that can super-resolve an image of any size to arbitrary resolution. While conceptually simple and 10x smaller than the SOTA model, the proposed network can also be used to build continuous generative models capable of solving PDE-based inverse problems at scales unseen during training. In addition to the above, we approach our goal of combining the benefits of deep learning and signal processing methods through another route: we take a traditional, interpretable approach from signal processing and modify it so that its performance matches that of deep learning methods. Specifically, inspired by the U-Net architecture, we propose a new multi-scale convolutional dictionary and use it to solve image reconstruction tasks with the convolutional sparse coding (CSC) approach. We show that this enables CSC to close the performance gap with SOTA deep neural networks on various challenging inverse problems while remaining mathematically tractable. Finally, we argue that the poor reliability of deep learning models can partly be attributed to their lack of interpretability and mathematical tractability. An improved understanding of a CNN's internal mechanisms can allow us to identify potential sources of brittleness in the network and design better architectures. To this end, we study the popular U-Net architecture and its ability to recover sharp, high-frequency edge information from inputs that primarily contain smooth, low-pass measurements.
Because recovering such high-frequency information is crucial to solving many inverse problems, we hope that this study can reveal insights into why the U-Net provides near state-of-the-art performance on a large variety of imaging applications. Taking a spectrum extrapolation perspective, we first identify the mechanisms used by the individual components of the U-Net to generate high frequencies from low-pass data and examine their properties. Taking a bottom-up approach, we then experimentally study the effectiveness and robustness of different variants of the U-Net and investigate the factors behind these behaviors. For example, our experiments on deblurring edges suggest a potential link between the spectral distribution of a U-Net's input and the need for a greater number of scales and greater depth in the network, an observation that we explain using the 'spectral spread' properties of the ReLU activation. Similarly, using experiments with Fourier shell correlation and finite-width neural tangent kernels, we make a case that aliased frequencies generated by stride layers can be actively exploited by a strided U-Net to efficiently generate new frequencies. This additionally allows us to explain our observation that stride layers are highly powerful yet brittle spectrum generators.
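The idea that a ReLU generates new frequencies from band-limited input can be checked on a toy signal. The sketch below is a minimal illustration under assumptions chosen for this example (a single cosine at bin 3, 64 samples, a plain DFT); it is one reading of 'spectral spread' and does not reproduce the experiments in the thesis.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitudes of the length-n DFT of x, normalized by n."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) / n
        for k in range(n)
    ]


def relu(x):
    """Elementwise max(v, 0)."""
    return [max(v, 0.0) for v in x]


def active_bins(x, tol=1e-9):
    """Non-negative-frequency DFT bins whose magnitude exceeds tol."""
    mags = dft_magnitudes(x)
    return [k for k in range(len(x) // 2 + 1) if mags[k] > tol]


N = 64   # number of samples (assumed for the example)
f0 = 3   # frequency bin of the input cosine (assumed for the example)
signal = [math.cos(2 * math.pi * f0 * t / N) for t in range(N)]

print("input bins:     ", active_bins(signal))
print("after ReLU bins:", active_bins(relu(signal)))
```

Since relu(v) = (v + |v|) / 2, and the absolute value of a cosine contains a constant term plus even harmonics, the rectified signal has energy at bin 0 and at bin 2 * f0 in addition to the original bin f0. On a sampled signal, harmonics above the Nyquist bin fold back to lower bins, which is the aliasing effect in its simplest form.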

Graduation Semester

2024-08

Type of Resource

Thesis

Handle URL

https://hdl.handle.net/2142/125624

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Graduate Theses and Dissertations at Illinois
