Director of Research (if dissertation) or Advisor (if thesis)
Banerjee, Arindam
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Overparameterization
Optimization
Deep Operator Networks
Language
eng
Abstract
Neural operators, which directly learn mappings between function spaces, have received considerable recent attention. Deep Operator Networks (DeepONets), a popular recent class of operator networks, have shown promising preliminary results in approximating solution operators of parametric partial differential equations. Despite their universal approximation guarantees, there is as yet no optimization convergence guarantee for DeepONets trained with gradient descent (GD). In this thesis, we establish such guarantees and show that overparameterization based on wide layers provably helps. In particular, we present two types of optimization convergence analysis: first, for smooth activations, we bound the spectral norm of the Hessian of DeepONets and use the bound to show geometric convergence of GD based on restricted strong convexity (RSC); and second, for ReLU activations, we show that the neural tangent kernel (NTK) of DeepONets at initialization is positive definite, which can be used with the standard NTK analysis to imply geometric convergence. Further, we present empirical results on three canonical operator learning problems: Antiderivative, Diffusion-Reaction equation, and Burgers' equation, and show that wider DeepONets lead to lower training loss on all the problems, thereby supporting the theoretical results.