Operator learning in the overparameterized regime

Shrimali, Bhavesh

Permalink

https://hdl.handle.net/2142/120438

Description

Title

Operator learning in the overparameterized regime

Author(s)

Shrimali, Bhavesh

Issue Date

2023-05-01

Director of Research (if dissertation) or Advisor (if thesis)

Banerjee, Arindam

Department of Study

Computer Science

Discipline

Computer Science

Degree Granting Institution

University of Illinois at Urbana-Champaign

Degree Name

M.S.

Degree Level

Thesis

Keyword(s)

Overparameterization
Optimization
Deep Operator Networks

Language

eng

Abstract

Neural operators, which directly learn mappings between function spaces, have received considerable recent attention. Deep Operator Networks (DeepONets), a popular recent class of operator networks, have shown promising preliminary results in approximating solution operators of parametric partial differential equations. Despite universal approximation guarantees, there is as yet no optimization convergence guarantee for DeepONets trained with gradient descent (GD). In this thesis, we establish such guarantees and show that overparameterization based on wide layers provably helps. In particular, we present two types of optimization convergence analysis: first, for smooth activations, we bound the spectral norm of the Hessian of DeepONets and use the bound to show geometric convergence of GD based on restricted strong convexity (RSC); and second, for ReLU activations, we show that the neural tangent kernel (NTK) of DeepONets at initialization is positive definite, which can be used with the standard NTK analysis to imply geometric convergence. Further, we present empirical results on three canonical operator learning problems: the antiderivative operator, the diffusion-reaction equation, and Burgers' equation, and show that wider DeepONets lead to lower training loss on all three problems, thereby supporting the theoretical results.
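For readers unfamiliar with the architecture named in the abstract, the following is a minimal sketch of a DeepONet forward pass: a branch network encodes the input function sampled at fixed sensor points, a trunk network encodes the query location, and the prediction is their inner product. It is not taken from the thesis. The layer sizes, the tanh activation, the initialization scale, and the names `init_mlp`, `mlp`, and `deeponet` are all illustrative assumptions, and the network is left untrained.

```python
import math
import random

def init_mlp(sizes, rng):
    """Gaussian weights scaled by 1/sqrt(fan_in); biases start at zero (assumed scheme)."""
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        W = [[rng.gauss(0.0, 1.0) / math.sqrt(fan_in) for _ in range(fan_in)]
             for _ in range(fan_out)]
        b = [0.0] * fan_out
        layers.append((W, b))
    return layers

def mlp(layers, x):
    """Apply tanh (a smooth activation) after every layer except the last."""
    for i, (W, b) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x

def deeponet(branch, trunk, u_sensors, y):
    """G(u)(y) is approximated by sum_k branch_k(u(x_1), ..., u(x_m)) * trunk_k(y)."""
    b = mlp(branch, u_sensors)   # encodes the input function
    t = mlp(trunk, [y])          # encodes the query location
    return sum(bk * tk for bk, tk in zip(b, t))

rng = random.Random(0)
m, width, p = 10, 64, 16   # sensors, hidden width, latent dimension (illustrative)
branch = init_mlp([m, width, p], rng)
trunk = init_mlp([1, width, p], rng)

# Antiderivative-style setup: the input function u(x) = cos(x) sampled on [0, 1].
sensors = [i / (m - 1) for i in range(m)]
u = [math.cos(x) for x in sensors]
prediction = deeponet(branch, trunk, u, 0.5)   # untrained, so not yet close to sin(0.5)
```

In this sketch, `width` is the quantity that the overparameterized regime would take to be large; training by gradient descent on a squared loss is omitted.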

Graduation Semester

2023-05

Type of Resource

Text

Handle URL

https://hdl.handle.net/2142/120438

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Graduate Theses and Dissertations at Illinois

Dissertations and Theses - Computer Science

Dissertations and Theses from the Siebel School of Computer Science
