Nonlinear and geometric control methods in deep learning theory
Hanson, Joshua McKinley
This item is only available for download by members of the University of Illinois community. Students, faculty, and staff at the U of I may log in with their NetID and password to view the item. If you are trying to access an Illinois-restricted dissertation or thesis, you can request a copy through your library's Inter-Library Loan office or purchase a copy directly from ProQuest.
Permalink
https://hdl.handle.net/2142/127381
Description
Title
Nonlinear and geometric control methods in deep learning theory
Author(s)
Hanson, Joshua McKinley
Issue Date
2024-12-03
Director of Research (if dissertation) or Advisor (if thesis)
Raginsky, Maxim
Doctoral Committee Chair(s)
Raginsky, Maxim
Committee Member(s)
Baryshnikov, Yuliy
Belabbas, Mohamed Ali
Liberzon, Daniel
Department of Study
Electrical & Computer Eng
Discipline
Electrical & Computer Engr
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
Neural networks
statistical learning theory
Rademacher complexity
deep learning
nonlinear control
geometric control
Abstract
The practical success and popularity of deep neural networks for solving modern machine learning problems motivates us to develop a rigorous theoretical understanding of how depth affects model expressivity and generalizability. Interpreting deep learning architectures as control systems unlocks versatile tools from nonlinear systems theory for studying these models and the associated statistical learning problems. In this dissertation, we present three main technical works that take advantage of this perspective, summarized as follows.

The first work describes an encoder-decoder architecture for learning immersed submanifolds from data, inspired by the structure of the group action in Sussmann's orbit theorem, which is built from compositions of forward- and backward-in-time flow maps. We proceed to develop generalization bounds for this model class and apply these results to a handful of illustrative examples.

The second work investigates a technique for proving generalization bounds for neural ordinary differential equations based on transforming the model into an equivalent infinite-dimensional kernel machine via the Chen–Fliess expansion, which expresses the model output as an infinite series in terms of signature integrals of the control and iterated Lie derivatives of the output map. Rather than bounding the covering number of the model class by propagating a parameter perturbation through the flow map, this technique takes advantage of standard tools applicable to kernel machines.

The third work focuses on deriving quantitative approximation error bounds for neural ordinary differential equations having at most quadratic nonlinearities in the dynamics. The simplicity of this model class demonstrates how expressivity can arise primarily from iteratively composing many elementary operations, rather than from the complexity of those elementary operations themselves. Together, these results contribute to our understanding of what depth imparts to the capabilities of deep learning architectures.
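For context on the expansion referenced in the second work, the following is a generic statement of the Chen–Fliess series for a control-affine system; the notation is standard but not necessarily the convention used in the dissertation. For a system

\[
\dot{x}(t) = g_0(x(t)) + \sum_{i=1}^{m} u_i(t)\, g_i(x(t)), \qquad y(t) = h(x(t)), \qquad x(0) = x_0,
\]

the output admits, up to indexing conventions, the expansion

\[
y(t) = h(x_0) + \sum_{k \ge 0} \; \sum_{i_0, \dots, i_k = 0}^{m} L_{g_{i_0}} \cdots L_{g_{i_k}} h(x_0) \int_0^t d\xi_{i_k} \cdots d\xi_{i_0},
\]

where $L_{g_i}$ denotes the Lie derivative along $g_i$ and the iterated (signature) integrals are built recursively from $d\xi_0 = d\tau$ and $d\xi_i = u_i(\tau)\, d\tau$ for $i \ge 1$. The coefficients depend only on the vector fields and the output map, while the iterated integrals depend only on the control; this is roughly the separation that a kernel-machine reformulation can exploit.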
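As a concrete illustration of the model class in the third work, here is a minimal sketch of a neural ordinary differential equation whose right-hand side contains only constant, linear, and quadratic terms in the state. All dimensions, parameter values, and the specific parameterization are hypothetical and chosen for readability; they are not taken from the dissertation.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration only.
d = 4      # state dimension
T = 1.0    # integration horizon (the continuous-time analogue of depth)

# Right-hand side with at most quadratic nonlinearities:
#   dx/dt = A x + Q (x ⊗ x) + b,
# where x ⊗ x stacks all pairwise products of state coordinates.
A = 0.1 * rng.standard_normal((d, d))
Q = 0.1 * rng.standard_normal((d, d * d))
b = 0.1 * rng.standard_normal(d)

def quadratic_rhs(t, x):
    """Vector field with only linear and quadratic dependence on the state."""
    return A @ x + Q @ np.kron(x, x) + b

# The "forward pass": flow the input through the dynamics for time T.
x0 = rng.standard_normal(d)
sol = solve_ivp(quadratic_rhs, (0.0, T), x0, rtol=1e-8, atol=1e-8)
print("input :", x0)
print("output:", sol.y[:, -1])
```

In this setting, expressivity comes from composing the flow over time (depth) rather than from the complexity of the instantaneous vector field, which remains at most quadratic.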