Files in this item

File:Wang_Gang.pdf (18MB)
Description:(no description provided)
Format:PDF
Description

Title:Using the Internet for object image retrieval and object image classification
Author(s):Wang, Gang
Advisor(s):Forsyth, David A.
Contributor(s):Forsyth, David A.
Department / Program:Electrical & Computer Engineering
Discipline:Electrical & Computer Engineering
Degree Granting Institution:University of Illinois at Urbana-Champaign
Degree:M.S.
Genre:Thesis
Subject(s):Internet
Object image retrieval
Object image classification
Abstract:The Internet has become the largest repository of digital resources, a large portion of which are images and related multimedia content such as text and videos. This content is valuable for many computer vision tasks. In this thesis, two case studies show how to leverage information from the Internet for two important computer vision tasks: object image retrieval and object image classification. Case study 1 addresses object image retrieval. Given a specified object class label, we aim to retrieve relevant images found on web pages by analyzing both the text surrounding each image and the image's appearance. For this task, we exploit established online knowledge resources (Wikipedia pages for text; the Flickr and Caltech data sets for images), which provide rich text and object appearance information. We report results on two data sets. The first is Berg's collection of 10 animal categories, on which we significantly outperform previous approaches. In addition, we have collected 5 more categories, and experimental results show the effectiveness of our approach on this new data set as well. Case study 2 addresses object image classification. We introduce a text-based image feature and demonstrate that it consistently improves performance on hard object classification problems. The feature is built using an auxiliary dataset of tag-annotated images downloaded from the Internet. We do not inspect or correct the tags and expect them to be noisy. We obtain the text feature of an unannotated image from the tags of its k-nearest neighbors in this auxiliary collection. A visual classifier presented with an object viewed under novel circumstances (say, a new viewing direction) must rely on its visual examples, but our text feature may not change much, because the auxiliary dataset likely contains a similar picture. While the tags associated with images are noisy, they are more stable than appearance.
We test the performance of this feature on the PASCAL VOC 2006 and 2007 datasets. Our feature performs well; it consistently improves the performance of visual object classifiers and is particularly effective when the training dataset is small.
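The k-nearest-neighbor tag pooling described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the function name, the Euclidean distance metric, and the normalized tag-histogram representation are all assumptions made for the sketch.

```python
import numpy as np

def knn_tag_feature(query_feat, aux_feats, aux_tags, vocab, k=5):
    """Build a text feature for an unannotated image by pooling the tags
    of its k nearest neighbors in an auxiliary tagged collection.

    Illustrative sketch only: distance metric and histogram pooling are
    assumptions, not taken from the thesis.

    query_feat: (d,) visual feature of the query image
    aux_feats:  (n, d) visual features of the auxiliary images
    aux_tags:   list of n tag lists (noisy, uncorrected)
    vocab:      list of tag strings defining the feature dimensions
    """
    # Find the k visually closest auxiliary images (Euclidean distance).
    dists = np.linalg.norm(aux_feats - query_feat, axis=1)
    neighbors = np.argsort(dists)[:k]

    # Accumulate a tag histogram over the neighbors' (noisy) tags.
    index = {tag: i for i, tag in enumerate(vocab)}
    feat = np.zeros(len(vocab))
    for j in neighbors:
        for tag in aux_tags[j]:
            if tag in index:
                feat[index[tag]] += 1.0

    # L1-normalize so the feature is comparable across queries.
    total = feat.sum()
    return feat / total if total > 0 else feat
```

The resulting vector can then be concatenated with, or combined alongside, a purely visual feature when training an object classifier, which is the role the text feature plays in the abstract's second case study.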
Issue Date:2010-01-06
URI:http://hdl.handle.net/2142/14623
Rights Information:Copyright 2009 Gang Wang
Date Available in IDEALS:2010-01-06
Date Deposited:December 2