Files in this item
Files | Description | Format |
---|---|---|
application/pdf | (no description provided) |
Description
Title: | Self-supervised learning of spatiotemporal features from video colorization |
Author(s): | Pahuja, Zubin |
Advisor(s): | Forsyth, David A |
Department / Program: | Computer Science |
Discipline: | Computer Science |
Degree Granting Institution: | University of Illinois at Urbana-Champaign |
Degree: | M.S. |
Genre: | Thesis |
Subject(s): | colorization; self-supervised learning; tracking; video |
Abstract: | We pose video colorization as a self-supervised learning problem for visual tracking. We use large amounts of freely available unlabeled video from YouTube to learn colorization without explicit supervision. However, instead of predicting the color directly from the gray-scale frame, we constrain the model to solve this task by learning to copy colors from a reference frame. By equipping the model with a pointing mechanism into a reference frame, we learn an explicit spatiotemporal feature representation that can be used as a generic tracker for new tracking tasks without additional training or fine-tuning. Our self-supervised model can propagate any annotation from the first frame as a reference to the rest of the video. Experimental results suggest that the learned feature representations can be effectively transferred to video tracking and object segmentation tasks. We perform extensive quantitative and qualitative evaluations on the DAVIS-2017 video object segmentation dataset and demonstrate significant improvements over the baseline. Although the model is trained without any ground-truth labels, our method learns to track well enough to outperform the latest methods based on optical flow. Since annotating videos is expensive and tracking has many applications in robotics and graphics, we believe learning to track with self-supervision can have a large impact. More broadly, we show that the features learned from a task for which cheap training data is readily available can be used to learn a task which would otherwise require an expensive, large-scale dataset with minimal supervision. Thus, we hope our results encourage a broader exploration in the promising field of self-supervised learning. |
Issue Date: | 2019-07-19 |
Type: | Text |
URI: | http://hdl.handle.net/2142/105728 |
Rights Information: | Copyright 2019 Zubin Pahuja |
Date Available in IDEALS: | 2019-11-26 |
Date Deposited: | 2019-08 |
This item appears in the following Collection(s)
- Dissertations and Theses - Computer Science
  Dissertations and Theses from the Dept. of Computer Science
- Graduate Dissertations and Theses at Illinois
  Graduate Theses and Dissertations at Illinois