Files in this item

File:XU-DISSERTATION-2017.pdf (12MB)
Description:(no description provided)
Format:PDF (application/pdf)

Description

Title:Image and video object selection
Author(s):Xu, Ning
Director of Research:Huang, Thomas
Doctoral Committee Chair(s):Huang, Thomas
Doctoral Committee Member(s):Hasegawa-Johnson, Mark; Lazebnik, Svetlana; Liang, Zhi-Pei
Department / Program:Electrical & Computer Eng
Discipline:Electrical & Computer Engr
Degree Granting Institution:University of Illinois at Urbana-Champaign
Degree:Ph.D.
Genre:Dissertation
Subject(s):Object selection; Computer vision; Deep learning; Image segmentation; Video segmentation
Abstract:Image and video object selection are fundamental research problems in computer vision with many practical applications. They are important technologies in image and video editing, film production, robotics, and autonomous driving. Previous methods have serious limitations on these tasks for several reasons. First, most of them rely on low-level, handcrafted features, which are not optimal. Second, they lack a high-level understanding of "objectness" and semantics. Last but not least, they generalize poorly to unconstrained scenarios. Recently, deep learning has become the dominant approach for computer vision tasks, including recognition and detection, because it can not only learn good feature representations in an end-to-end manner but also capture high-level semantics effectively. However, it has seen little exploration in image and video object selection. Therefore, in this thesis we propose several novel deep-learning-based methods to address these limitations. Our algorithms are easy to understand and effective. Experimental results demonstrate the superiority of our algorithms over previous methods. Some highlights include the following: (1) Our interactive segmentation algorithm is the first deep-learning-based algorithm for this task and achieves state-of-the-art results on both small-scale and large-scale benchmarks. (2) Our rectangle-based algorithm introduces a transformation of rectangle inputs into attention-like distance maps and achieves robust performance under sloppy user selections or misplaced detection boxes. (3) Our image matting algorithm is the first to demonstrate the feasibility of learning an alpha matte end-to-end given an image and a trimap. It also achieves state-of-the-art results on image matting and video matting benchmarks. (4) Our video object segmentation method combines a CNN with RNN memory cells to learn both good image feature representations and spatio-temporal coherence.
Issue Date:2017-12-06
Type:Text
URI:http://hdl.handle.net/2142/99515
Rights Information:Copyright 2017 Ning Xu
Date Available in IDEALS:2018-03-13
2020-03-14
Date Deposited:2017-12

