Sampling, non-linearities and scales: rethinking convolutional neural networks from a signal processing perspective
Chaman, Anadi
Permalink
https://hdl.handle.net/2142/125624
Description
- Title
- Sampling, non-linearities and scales: rethinking convolutional neural networks from a signal processing perspective
- Author(s)
- Chaman, Anadi
- Issue Date
- 2024-07-12
- Director of Research (if dissertation) or Advisor (if thesis)
- Dokmanić, Ivan
- Doctoral Committee Chair(s)
- Dokmanić, Ivan
- Committee Member(s)
- Bresler, Yoram
- Weiss, Yair
- Zhao, Zhizhen
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- convolutional neural networks
- shift invariance
- mesh-free super-resolution
- mesh-free generative models
- reliable deep learning
- convolutional sparse coding
- multi-scale models
- Abstract
- A large body of work from signal processing and optimization has traditionally been used to address imaging inverse problems, including multi-scale signal analysis, compressed sensing, and sparse representations. While these traditional methods are mathematically tractable and provide reliable solutions, their performance has always been limited by their inability to model complex distributions beyond hand-crafted priors. Deep convolutional neural networks (CNNs), on the other hand, have been immensely successful in the last decade and have provided state-of-the-art (SOTA) performance on various challenging tasks, thanks to their ability to learn sophisticated priors directly from data. While these advances are substantial, cracks in their performance have been revealed in recent years, often arising when the underlying architectures are designed without regard for principles from signal processing and inverse problems. For example, the performance of CNN classifiers can be extremely brittle to a shift of merely a single pixel in the input image, an effect resulting from downsampling by pooling layers. Additionally, when black-box CNNs are used to solve inverse problems, ignoring the geometry of the underlying task can result in poor generalization beyond the training distribution. In this thesis, we aim to bridge the gap between traditional methods and deep convolutional approaches for inverse problems, thereby combining the benefits of high reliability with the excellent performance of deep learning on challenging tasks. Taking inspiration from signal processing, we design new and conceptually simple architectural modifications to existing convolutional neural networks that enable provable robustness gains, improved out-of-distribution generalization, and architectural simplicity and interpretability.
For example, we propose new adaptive sampling (pooling) layers that provably restore perfect robustness to shifts in convolutional neural networks. With our solution, the resulting networks exhibit perfect consistency to shifts even before training, making it the first approach that renders CNNs truly shift-invariant. We also propose a convolutional, mesh-free, and continuous super-resolution architecture that can super-resolve an image of any size to arbitrary resolution. While conceptually simple and 10x smaller than the SOTA model, the proposed network can also be used to build continuous generative models capable of solving PDE-based inverse problems at scales unseen during training. In addition to the above, we approach our goal of combining the benefits of deep learning and signal processing methods through another route: we take a traditional, interpretable approach from signal processing and modify it so that its performance matches that of deep learning methods. Specifically, inspired by the U-Net architecture, we propose a new multi-scale convolutional dictionary and use it to solve image reconstruction tasks with the convolutional sparse coding (CSC) approach. We show that this enables CSC to close the performance gap with SOTA deep neural networks on various challenging inverse problems while remaining mathematically tractable. Finally, we argue that the poor reliability of deep learning models can partly be attributed to their lack of interpretability and mathematical tractability. An improved understanding of a CNN's internal mechanisms can allow us to identify potential sources of brittleness in the network and design better architectures. To this end, we study the popular U-Net architecture and its ability to recover sharp, high-frequency edge information from inputs that primarily contain smooth, low-pass measurements.
Recovering high-frequency content from low-pass measurements is a crucial step in solving many inverse problems, and we hope that studying it can reveal why the U-Net is able to provide near state-of-the-art performance on a large variety of imaging applications. Taking a spectrum extrapolation perspective, we first identify the mechanisms used by the individual components of the U-Net to generate high frequencies from low-pass data and examine their properties. Taking a bottom-up approach, we then experimentally study the effectiveness and robustness of different variants of the U-Net and investigate the factors that result in those behaviors. For example, our experiments on deblurring edges suggest a potential link between the spectral distribution of a U-Net's input and the need for a greater number of scales and greater depth in the network, an observation that we explain using the 'spectral spread' properties of the ReLU activation. Similarly, using experiments with Fourier shell correlation and finite-width neural tangent kernels, we make a case that aliased frequencies generated by stride layers can be actively utilized by a strided U-Net to efficiently generate new frequencies. This additionally allows us to explain our observation that stride layers are highly powerful yet brittle spectrum generators.
- Graduation Semester
- 2024-08
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/125624
- Copyright and License Information
- Copyright 2024 Anadi Chaman
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois