Files in this item

File:DONG-THESIS-2018.pdf (5MB), restricted to U of Illinois
Description:(no description provided)
Format:PDF (application/pdf)

Description

Title:Weakly supervised learning from referring expression: Challenge and directions
Author(s):Dong, Taiyu
Advisor(s):Hoiem, Derek
Department / Program:Computer Science
Discipline:Computer Science
Degree Granting Institution:University of Illinois at Urbana-Champaign
Degree:M.S.
Genre:Thesis
Subject(s):weakly supervised learning
object localization
segmentation from natural language
object localization from natural language
Abstract:We explore methods for weakly supervised learning from referring expressions. Unlike traditional fully supervised semantic segmentation or object recognition tasks, in which a small set of discrete classes is provided, the referring expression task is conditioned on a natural-language phrase, e.g. “the dude on the dolphin”. Previous approaches combine an LSTM with a fully convolutional network and achieve fairly good results in the fully supervised setting. However, the fully supervised setting is limited by the need to manually label segmentation masks, which requires a significant amount of human labor. Therefore, we work on an approach that performs segmentation with only image-level language descriptions. In our weakly supervised setting, we are given only the input images and their corresponding sentence descriptions, without pixel-level labels for each image as ground truth. To obtain supervision from the language descriptions alone, we use a multiple instance learning loss. We first develop an end-to-end model to localize the image content corresponding to the language expressions. In this model, we use GloVe and ELMo sentence embeddings to obtain a vector representation for each sentence, which is combined with image features from a fully convolutional network. However, the sentence-level model is hard to interpret, so we also study the more fundamental problem of weakly supervised object localization from referring expressions. We compare the performance of the sentence-level model on this task to that of an alternative word-level model. Our investigation suggests that breaking the referring expression localization problem into smaller, more manageable components is promising.
Issue Date:2018-12-13
Type:Thesis
URI:http://hdl.handle.net/2142/102864
Rights Information:Copyright 2018 Taiyu Dong
Date Available in IDEALS:2019-02-07
Date Deposited:2018-12

