Files in this item

File: FAN-DISSERTATION-2021.pdf (23MB), Restricted to U of Illinois
Description: (no description provided)
Format: application/pdf (PDF)

Description

Title: Sparse representation in deep vision models
Author(s): Fan, Yuchen
Director of Research: Hasegawa-Johnson, Mark
Doctoral Committee Chair(s): Hasegawa-Johnson, Mark
Doctoral Committee Member(s): Liang, Zhi-Pei; Smaragdis, Paris; Shi, Humphrey
Department / Program: Electrical & Computer Eng
Discipline: Electrical & Computer Engr
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: Ph.D.
Genre: Dissertation
Subject(s): deep learning; sparse; computer vision
Abstract: Sparse representation plays a critical role in vision problems, including both generation and understanding. Image generation tasks are inherently ill-posed: the input signal usually carries insufficient information, and infinitely many outputs are consistent with the same input. Thus, it is commonly believed that sparse representations are more robust in handling the considerable diversity of solutions. Image understanding likewise depends on sparse representations that are invariant and robust to various transformations, e.g., color, lighting, and viewpoint. Deep neural networks extend sparse coding-based methods from linear structures to cascaded linear and non-linear structures. However, sparsity of hidden representations in deep neural networks cannot be obtained by iterative optimization as in sparse coding, since deep networks are feed-forward during inference. I invented a method that structurally enforces sparsity constraints on hidden neurons in deep networks while keeping the representation high-dimensional. Given high-dimensional neurons, I divide them into groups along the channel dimension and allow only one group of neurons to be non-zero at a time. The adaptive selection of the non-zero group is modeled by tiny side networks conditioned on context features (see the sketch after this record). Computation is also saved, since it is performed only on the non-zero group. I further extended the sparsity constraints to the attention mechanism, which is built upon pairwise correlations between pixels and incurs computation cost quadratic in the input size. This mutual correlation is inherently sparse, since a pixel in a single image is not necessarily highly correlated with most other pixels. I proposed a method for more efficient computation of the attention mechanism given the sparsity prior on the correlation matrix. I also investigated sparse scene representations modeled with deep neural networks. Given sparsely rendered views of a 3D scene, the proposed deep neural network approach efficiently performs spatiotemporal reconstruction of high-definition images from a novel viewpoint.
Issue Date: 2021-12-03
Type: Thesis
URI: http://hdl.handle.net/2142/114008
Rights Information: Copyright 2021 Yuchen Fan
Date Available in IDEALS: 2022-04-29
Date Deposited: 2021-12
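Illustrative note: the channel-group sparsity mechanism described in the abstract (split channels into groups and let a tiny side network pick the single group allowed to be non-zero) can be sketched roughly as below. This is a minimal sketch in PyTorch-style Python under assumed details, not the dissertation's implementation; the module name GroupSparseBlock, the per-input selection granularity, and the softmax-during-training / one-hot-at-inference gating are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupSparseBlock(nn.Module):  # hypothetical name, not from the dissertation
    """Split channels into groups; a tiny side network selects which single
    group stays non-zero (soft softmax weights during training, hard one-hot
    selection at inference). Selection granularity here is per input sample,
    which is an assumption; the dissertation's scheme may differ."""
    def __init__(self, channels, num_groups=4):
        super().__init__()
        assert channels % num_groups == 0
        self.num_groups = num_groups
        self.body = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # Tiny side network: one logit per channel group, from pooled context features.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, num_groups, kernel_size=1),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        feats = self.body(x)                              # (B, C, H, W)
        logits = self.gate(x).view(b, self.num_groups)    # (B, G)
        if self.training:
            weights = F.softmax(logits, dim=1)            # soft, differentiable
        else:
            # Hard one-hot selection: only one group is non-zero, so an
            # optimized implementation could skip computing the other groups.
            weights = F.one_hot(logits.argmax(dim=1), self.num_groups).float()
        # Broadcast each group's weight over that group's channels.
        weights = weights.view(b, self.num_groups, 1, 1, 1)
        feats = feats.view(b, self.num_groups, c // self.num_groups, h, w)
        return (feats * weights).view(b, c, h, w)

# Example usage: y = GroupSparseBlock(64)(torch.randn(2, 64, 32, 32))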

