Files in this item



PLUMMER-DISSERTATION-2018.pdf (application/pdf, 9MB)


Title: Grounding natural language phrases in images and video
Author(s): Plummer, Bryan A.
Director of Research: Lazebnik, Svetlana
Doctoral Committee Chair(s): Lazebnik, Svetlana
Doctoral Committee Member(s): Hockenmaier, Julia; Hoiem, Derek; Brown, Matthew
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Computer Vision, Natural Language Processing, Phrase Grounding
Abstract: Grounding language in images has been shown to improve performance on many image-language tasks. To spur research on this topic, this dissertation introduces a new dataset that provides ground-truth annotations of the locations of noun phrase chunks in image captions. I begin by introducing a constituent task termed phrase localization, where the goal is to localize an entity known to exist in an image when provided with a natural language query. To address this task, I introduce an approach that learns a set of models, each of which captures a different concept useful for the task. These concepts can be predefined, such as attributes gleaned from adjectives, or learned automatically in a single end-to-end neural network. I also address the more challenging detection-style task, where the goal is to localize a phrase and determine whether it is associated with an image. Multiple applications of the models presented in this work demonstrate their value beyond the phrase localization task.
Issue Date: 2018-04-16
Rights Information: Copyright 2018 Bryan A. Plummer
Date Available in IDEALS: 2018-09-04
Date Deposited: 2018-05
