Files in this item

File: MALLYA-DISSERTATION-2018.pdf (49MB), Restricted Access
Description: (no description provided)
Format: application/pdf

Description

Title: Learning and adapting visual models for multiple specialized tasks
Author(s): Mallya, Arun Mohanray
Director of Research: Lazebnik, Svetlana
Doctoral Committee Chair(s): Lazebnik, Svetlana
Doctoral Committee Member(s): Forsyth, David; Hoiem, Derek; Shakhnarovich, Gregory
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: Ph.D.
Genre: Dissertation
Subject(s): Action Recognition, Visual Relationship Detection, Image Situations, Multi-Task Training
Abstract: A key requirement for any agent that wishes to interact with the visual world is the ability to understand the behavior of objects in the scene, primarily through visual means. We humans, through our cognitive system, are able to localize other people and objects in scenes, understand their relationship to the surrounding environment, and reason not only about their actions and attributes, but also about concepts which require knowledge beyond what is afforded by the pixels in the visual input, such as possible future states, motion, a person's motivations, and so on. In this thesis, we outline work that takes small steps towards solving this daunting task of replicating the human visual cognitive system. This dissertation presents methods for predicting actions, interactions with objects, and increasingly structured scenarios from single images. We devise simple methods that make use of a variety of cues by taking into account the structure inherent in the tasks we aim to solve. We show that by solving these tasks as an intermediate step and using their outputs as features, we can develop methods that operate on visual and language inputs to improve performance on tasks that require high-level image information, such as answering questions about images and producing captions for images. One issue that accompanies the learning of multiple tasks with separate deep networks, such as the work described above, is the need to store separate models, which increases storage requirements and affects scalability. We formulate and present two novel methods that draw inspiration from network pruning and weight quantization and that can reuse parts of an existing network for learning new tasks with minimal additional overhead, without hurting performance on tasks that were learned earlier.
Issue Date: 2018-04-15
Type: Thesis
URI: http://hdl.handle.net/2142/101314
Rights Information: Copyright 2018 Arun Mallya
Date Available in IDEALS: 2018-09-04
Date Deposited: 2018-05
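The abstract's final point, reusing a single frozen network for multiple tasks by storing only per-task selection information, can be illustrated with a minimal sketch. This is not the dissertation's actual method, just a hypothetical NumPy toy in the same spirit: one shared weight matrix is frozen, and each task keeps only a binary mask over those weights, so per-task storage is one bit per parameter rather than a full model copy.

```python
import numpy as np

rng = np.random.default_rng(0)

# A single shared layer's weights, trained once and then frozen.
base_weights = rng.standard_normal((4, 4))

def task_forward(x, task_mask):
    """Apply the shared frozen weights, gated by a task-specific binary mask."""
    return (base_weights * task_mask) @ x

# Hypothetical per-task masks; in an actual system these would be learned.
mask_a = (rng.random((4, 4)) > 0.5).astype(float)
mask_b = (rng.random((4, 4)) > 0.5).astype(float)

x = rng.standard_normal(4)
out_a = task_forward(x, mask_a)   # task A's prediction
out_b = task_forward(x, mask_b)   # task B's prediction

# Per-task overhead is one bit per weight (the mask), not a second
# copy of base_weights, and the shared weights are never modified,
# so earlier tasks' behavior is preserved exactly.
per_task_overhead_bits = mask_a.size
```

Because `base_weights` is never updated, adding a new task cannot degrade performance on previously learned ones, which is the property the abstract emphasizes.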

