Files in this item



SONG-THESIS-2019.pdf (954 kB, application/pdf)


Title:Trainability and generalization of small-scale neural networks
Author(s):Song, Myung Hwan
Advisor(s):Sun, Ruoyu
Department / Program:Industrial & Enterprise Sys Eng
Discipline:Industrial Engineering
Degree Granting Institution:University of Illinois at Urbana-Champaign
Subject(s):Deep Learning
Neural Networks
Learning Theory
Abstract:As deep learning has become a solution for various machine learning and artificial intelligence applications, its architectures have developed accordingly. Modern deep learning applications often use overparameterized settings, the opposite of what conventional learning theory suggests. While deep neural networks are considered less vulnerable to overfitting even with their overparameterized architectures, this project observed that properly trained small-scale networks can indeed outperform their larger counterparts. The generalization ability of small-scale networks has been overlooked in much research and practice because of their extremely slow convergence. This project observed that an imbalanced layer-wise gradient norm can hinder the overall convergence speed of neural networks, and that narrow networks are particularly vulnerable to this. It investigates possible reasons for the convergence failure of small-scale neural networks and suggests a strategy to alleviate the problem.
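The imbalance the abstract refers to can be made concrete by comparing gradient norms layer by layer. The following is a minimal NumPy sketch, not the thesis code: it runs one backward pass through a hypothetical narrow two-layer network and reports the per-layer gradient norms, whose ratio is the kind of imbalance described above. All sizes and initializations here are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch: measure layer-wise gradient norms in a tiny
# two-layer ReLU network trained with mean-squared error.
rng = np.random.default_rng(0)

# Toy data: 32 samples, 10 features, scalar regression target.
X = rng.normal(size=(32, 10))
y = rng.normal(size=(32, 1))

# Narrow hidden layer (width 2), the "small-scale" regime of interest.
W1 = rng.normal(scale=0.1, size=(10, 2))
W2 = rng.normal(scale=0.1, size=(2, 1))

# Forward pass.
h_pre = X @ W1
h = np.maximum(h_pre, 0.0)   # ReLU
err = h @ W2 - y             # prediction error

# Backward pass (standard backprop for this two-layer net).
grad_W2 = h.T @ err / len(X)
grad_h = err @ W2.T
grad_h[h_pre <= 0] = 0.0
grad_W1 = X.T @ grad_h / len(X)

# Per-layer gradient norms; a ratio far from 1 indicates the
# layer-wise imbalance that can slow overall convergence.
n1, n2 = np.linalg.norm(grad_W1), np.linalg.norm(grad_W2)
print(f"||grad W1|| = {n1:.4f}  ||grad W2|| = {n2:.4f}  ratio = {n2 / n1:.2f}")
```

With small random initial weights, the last layer's gradient typically dominates the first layer's, since the first layer's gradient is scaled by the small entries of W2; this is one mechanism by which narrow networks can stall early in training.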
Issue Date:2019-04-22
Rights Information:Copyright 2019 Myung Hwan Song
Date Available in IDEALS:2019-08-23
Date Deposited:2019-05
