Files in this item

File: Tran_Duan.pdf (6MB)
Description: (no description provided)
Format: PDF (application/pdf)

Description

Title: Structure prediction for human parsing
Author(s): Tran, Duan
Director of Research: Forsyth, David A.
Doctoral Committee Member(s): Ahuja, Narendra; Hoiem, Derek W.; Ramanan, Deva
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: Ph.D.
Genre: Dissertation
Subject(s): Image Parsing; Human Pose Parsing; People Parsing; Object Detection; Structure Prediction
Abstract: This thesis shows that structure prediction is well suited to detecting and parsing people in images (and videos), because it learns local part appearance models jointly with the relationships between body parts. In detecting people, this method can deal with hard cases, for example a person mounting a bicycle, that are uncommon in the training data and can cause current person detectors to fail. This thesis demonstrates a pedestrian finder that first finds the most likely human pose in the window using a discriminative procedure trained with structure learning on a small dataset, then presents features based on that configuration to an SVM classifier. Using the INRIA Person dataset, this thesis shows that estimates of configuration significantly improve the accuracy of a discriminative pedestrian finder. This thesis also gives quantitative evidence that a full relational model of the body performs better at upper body parsing than the standard tree model, despite the need to adopt approximate inference and learning procedures. The method uses an approximate search for inference and an approximate structure learning method for training. This thesis compares the method to state-of-the-art methods on a dataset prepared at UIUC (which depicts a wide range of poses), on the standard Buffy dataset, and on the recently published reduced PASCAL dataset. Results suggest that the Buffy dataset overemphasizes poses where the arms hang down, which leads to generalization problems. Although a full relational model outperforms a tree-structured model, its practical use is still limited by the high complexity of inference. This thesis therefore presents a method to boost a parser with poselet pruners. The method first develops a cascade of hierarchical poselet pruners to prune the search space to a small set of part states, and then builds a hierarchical poselet parser to find part locations on the pruned set. Experiments on the UIUC Sport dataset show that the poselet pruners can effectively prune away more than 99.6% of the part states as unlikely, leaving about 500 states per part. This small set of part states allows the use of advanced appearance models for better parsers. The method achieves performance comparable to that of state-of-the-art methods while finding part locations several times faster.
Issue Date: 2012-02-01
Genre: thesis
URI: http://hdl.handle.net/2142/29473
Rights Information: Copyright 2011 Duan Tran
Date Available in IDEALS: 2014-02-01
Date Deposited: 2011-12

