Files in this item

File:ROCK-DISSERTATION-2019.pdf (86MB)
Description:(no description provided)
Format:PDF (application/pdf)

Description

Title:Learning and evaluating image representations
Author(s):Rock, Jason
Director of Research:Forsyth, David
Doctoral Committee Chair(s):Forsyth, David
Doctoral Committee Member(s):Lazebnik, Svetlana; Schwing, Alexander G.; Barron, Jonathan
Department / Program:Computer Science
Discipline:Computer Science
Degree Granting Institution:University of Illinois at Urbana-Champaign
Degree:Ph.D.
Genre:Dissertation
Subject(s):Computer Vision
Deep Learning
Image processing
Image representation
intrinsic images
intrinsic image decomposition
rain removal
OCR
Abstract:Prior to deep learning, it was common to approach computer vision problems by describing a model that could be learned from a relatively small amount of data by incorporating domain knowledge. For example, image prediction tasks such as intrinsic image decomposition were approached by thinking about what reflectance and shading look like: in the case of reflectance, a Mondrian image; and in the case of shading, a smooth image. The difficult part was formalizing this prior domain knowledge into a model. Deep learning has changed this paradigm. While deep learning hasn’t eliminated the value of domain knowledge, for many problems we now think in terms of model architectures and losses instead. While a choice of model architecture limits the types of results possible, neural networks tend to be less task dependent than domain-specific methods. In fact, for almost any problem there is a fairly simple formula for using neural networks to get good results: 1. Collect labeled data, 2. Choose a network architecture, 3. Define a loss and train. However, there are still tasks where we might not be able to collect a lot of labeled data of a particular form (Grave OCR), or tasks where we can’t easily describe an unambiguous loss on easily collected data (Intrinsic Image Decomposition, Image correction including rain, cracks and glare), or tasks where we want to do many similar tasks without having to train each one independently (face adjustment). A unifying theme of my work is that generic representations can be learned from data and those learned representations can be used to make otherwise under-constrained problems tractable. Pre-deep learning, this generic representation takes the form of a LEARCH-based model; more recent work builds on autoencoder representations. For authoring decompositions and removing rain, cracks, and glare, autoencoder models are learned from fake data and then shown to be applicable on real images.
For learning to decompose rainy images, cycle consistency losses are incorporated to learn without examples of de-rained images. In Face-to-Face transformation, an attribute-sensitive image-to-image representation is pretrained and then a low-dimensional representation for image attribute transformations is described. In Grave OCR, we learn to generate data and learn the image decomposition model simultaneously, allowing us to learn how to predict image annotations without labeled data. Finally, in evaluating intrinsic image decomposition, we explore evaluating intrinsic image models using human perception annotations. We show that human annotation evaluation has some issues and does not appear to differentiate between qualitatively different models. We propose a new task-specific procedure for evaluating intrinsic image decomposition using repainting and reshading and show that it can be used to identify differences between models that are currently unidentified.
Issue Date:2019-12-03
Type:Text
URI:http://hdl.handle.net/2142/106236
Rights Information:Copyright 2019 Jason Rock
Date Available in IDEALS:2020-03-02
Date Deposited:2019-12

