Files in this item



Hsien Ting_Cheng.pdf (application/pdf, 11MB)


Title: Unsupervised video segmentation and its application to activity recognition
Author(s): Cheng, Hsien Ting
Director of Research: Ahuja, Narendra
Doctoral Committee Chair(s): Ahuja, Narendra
Doctoral Committee Member(s): Forsyth, David A.; Hasegawa-Johnson, Mark A.; Huang, Thomas
Department / Program: Electrical & Computer Eng
Discipline: Electrical & Computer Engr
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Video segmentation; Unsupervised clustering; Activity recognition; Multiple instance learning
Abstract: We addressed two fundamental problems of computer vision, segmentation and recognition, in the space-time domain. Knowing that generic image segmentation produces unstable regions due to illumination, compression, etc., we utilized temporal information to achieve consistent 3D video segmentation. Exploiting non-local structure in both space and time alleviated the instabilities of the segmented regions. A segmentation tree was built within every frame, and label consistency was enforced within each subtree (i.e., spatial clique). By roughly tracking 2D regions across frames, temporal cliques were built in which label consistency was enforced as well. The resulting high-order (more than binary) Conditional Random Field (CRF) was designed and solved efficiently. Experimental results demonstrated high-quality segmentation both quantitatively and qualitatively. Taking segmented 3D regions, called tubes, as input, we developed an activity recognition framework that determines not only which activity exists in a video but also where it happens. A robust tube feature was extracted from photometric and shape dynamics information. Each activity was described by a Parts Activity Model (PAM) with a root template and a four-part template under the root. Because only some parts of a video determine the activity label, we formulated the problem using Multiple Instance Learning (MIL). Latent variables included a tube index and the part locations under the root template. Experiments were conducted on three well-known datasets, and state-of-the-art results were achieved.
Issue Date: 2015-01-21
Rights Information: Copyright 2014 Hsien Ting Cheng
Date Available in IDEALS: 2015-01-21
Date Deposited: 2014-12
