Title:From image co-segmentation to discrete optimization in computer vision - the exploration on graphical model, statistical physics, energy minimization, and integer programming
Author(s):Yang, Huiguang
Director of Research:Ahuja, Narendra
Doctoral Committee Chair(s):Hasegawa-Johnson, Mark A.
Doctoral Committee Member(s):Huang, Thomas S.; Do, Minh N.
Department / Program:Electrical & Computer Eng
Discipline:Electrical & Computer Engr
Degree Granting Institution:University of Illinois at Urbana-Champaign
Subject(s):image co-segmentation
graphical model
energy minimization
integer programming
statistical physics
discrete optimization
Mixed-Integer Quadratic Programming (MIQP)
local topology consistency check
sparse optimization
Abstract:This dissertation explores ideas and frameworks for solving discrete optimization problems in computer vision. Much of the work is inspired by the study of the image co-segmentation problem. Through research on this topic the author became familiar with the graphical model and energy minimization view of computer vision problems, that is, how to combine local information with neighborhood interaction information in a graphical system for inference, and came to realize that many problems in and beyond computer vision can be solved in this way. The dissertation begins with a comprehensive background review of graphical models, energy minimization, and integer programming, as well as their connections with fundamental statistical physics. We aim to review the concepts, models, and algorithms systematically and from a different perspective. For instance, we review the correspondence between the unary and binary energy terms commonly used in computer vision and those of the Ising model in statistical physics; we summarize several widely used discrete energy minimization algorithms in computer vision under a unified statistical physics framework; and we stress the close connection between graphical model energy minimization and integer programming, pointing out in particular the central role of Mixed-Integer Quadratic Programming in discrete optimization in and beyond computer vision. Moreover, we explore the relationship between integer programming and energy minimization experimentally. We test integer programming methods on randomly generated energy formulations (of the kind that appear in computer vision problems) and, conversely, energy minimization methods on the integer programming problem of Graph K-coloring.
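As a rough illustration of treating Graph K-coloring as an energy minimization problem, the sketch below counts same-colored edges as a Potts-style energy and reduces it with iterated conditional modes (ICM). It is a minimal example under stated assumptions, not the dissertation's experimental setup: the function names, the 4-cycle test graph, and the choice of ICM as the minimizer are all illustrative.

```python
import random


def coloring_energy(edges, colors):
    """Potts-style energy: the number of edges whose endpoints share a color."""
    return sum(1 for u, v in edges if colors[u] == colors[v])


def icm_coloring(n_nodes, edges, k, max_sweeps=100, seed=0):
    """Iterated conditional modes (hypothetical helper for this sketch).

    Each node in turn takes the color that conflicts with the fewest
    neighbors, holding all other nodes fixed. Stops at a local minimum.
    """
    rng = random.Random(seed)
    neighbors = [[] for _ in range(n_nodes)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    colors = [rng.randrange(k) for _ in range(n_nodes)]
    for _ in range(max_sweeps):
        changed = False
        for node in range(n_nodes):
            # Local energy of each candidate color: number of conflicting neighbors.
            conflicts = [
                sum(1 for nb in neighbors[node] if colors[nb] == c)
                for c in range(k)
            ]
            best = min(range(k), key=conflicts.__getitem__)
            # Only accept strict improvements, so the total energy never increases.
            if conflicts[best] < conflicts[colors[node]]:
                colors[node] = best
                changed = True
        if not changed:
            break  # no single-node change lowers the energy
    return colors


# Illustrative graph: a 4-cycle, colored with k = 3 colors.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
cycle_colors = icm_coloring(4, cycle_edges, k=3)
print(cycle_colors, coloring_energy(cycle_edges, cycle_colors))
```

ICM only guarantees a local minimum. On this example it reaches zero energy because the number of colors exceeds the maximum node degree; with fewer colors it can stall at a coloring that still has conflicts, which is one reason to compare it against other minimizers.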
Testing each family of methods on the other's problems lets us compare the optimization performance of the various methods, whether designed for energy minimization or for integer programming, on one platform. We conclude that sharing methods across the two fields (energy minimization in computer vision and integer programming in applied mathematics) is beneficial. Using the statistical physics inspired energy minimization framework, we then formulate density-based clustering as an energy minimization problem. Energy is defined in terms of inhomogeneity in local point density. A sequence of energy minima is found to recursively partition the points, which yields a hierarchical embedding of clusters that are increasingly homogeneous in density. Energy is expressed as the sum of a unary (data) term and a binary (smoothness) term. The only parameter the user must specify is a homogeneity criterion: the degree of acceptable fluctuation in density within a cluster. Thus, we do not have to specify, for example, the number of clusters present. Disjoint clusters with the same density are identified separately. Experimental results show that our method is able to handle clusters of different shapes, sizes, and densities. We present the performance of our approach using the energy optimization algorithms ICM, LBP, Graph-cut, and the mean field theory algorithm. We also show that the family of commonly used spectral and graph clustering algorithms (such as Normalized-cut) is a special case of our formulation that uses only the binary energy term and ignores the unary term. After this discussion of the general framework for solving discrete optimization problems in computer vision, the dissertation turns to image co-segmentation, work that was in fact carried out before the topics above. Image co-segmentation is the task of automatically discovering, locating, and segmenting some unknown common object in a set of images.
It has become a popular research topic in computer vision in recent years. The unsupervised nature is an important characteristic of the problem; i.e., the common object is a priori unknown. Moreover, the common object may be subject to changes in viewpoint and lighting, occlusion, and deformation across the images; all these conditions make the co-segmentation task very challenging. In this part of the study we propose several approaches to the problem. Most existing co-segmentation methods focus on images with a very dominant common object and very limited background interference. Such images are not realistic for the co-segmentation task, since in practice we routinely encounter images with rich and complex content in which the common object is not dominant and appears alongside a large number of other objects. In this work we address image co-segmentation on this kind of image, which many previous methods cannot handle properly. Two distinct approaches are proposed; the key difference between them lies in the method of common object discovery. The first is a "topology" based approach (also called a "point-region" approach), while the second is a "sparse optimization" based approach. Specifically, in the first approach we combine image key point features with segment features to discover the common object, relying on the local topology consistency of both the key point and segment layouts for robust recognition. The initial foreground obtained in each image (the common object) is then refined through graphical model energy minimization based on a global appearance model extracted from the entire image dataset.
The second approach is inspired by sparse optimization techniques: we use a sparse approximation scheme to find the optimal correspondence between the segments in two images as the initial estimate of the common object, based on linear additive features extracted from the segments. In both approaches we emphasize the exploration of inter-image information in all steps of the algorithms; therefore, the common object need not be dominant or salient in each individual image, as long as it is "common" across the image set. Extensive experiments validate the performance of the proposed approaches. We carry out experiments on widely used benchmark datasets for image co-segmentation, including the iCoseg dataset, the multi-view co-segmentation dataset, the Oxford flower dataset, and so forth. In addition, to better evaluate performance on rich and complex images with a non-dominant common object, we propose a new dataset called richCoseg. Experiments are also conducted on this new dataset, and qualitative and quantitative comparisons with recent methods are provided. Finally, the dissertation briefly discusses some other vision problems the author has studied in previously published works.
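For readers unfamiliar with the energy formulation the abstract refers to, the standard textbook forms are sketched below. The notation is an assumption made for illustration (the symbols $\theta_i$, $\theta_{ij}$, $h_i$, $J_{ij}$, $Q$, and $c$ do not come from the abstract) and may differ from the notation used in the dissertation itself.

```latex
% Pairwise energy on a graph with nodes V and edges \mathcal{E}:
% a unary (data) term per node plus a binary (smoothness) term per edge.
\[
  E(\mathbf{x}) = \sum_{i \in V} \theta_i(x_i)
                + \sum_{(i,j) \in \mathcal{E}} \theta_{ij}(x_i, x_j)
\]

% Ising Hamiltonian with spins s_i \in \{-1, +1\}:
% the external field h_i plays the role of the unary term,
% the coupling J_{ij} plays the role of the binary term.
\[
  H(\mathbf{s}) = -\sum_{i \in V} h_i s_i
                  - \sum_{(i,j) \in \mathcal{E}} J_{ij} s_i s_j
\]

% With binary labels, a pairwise energy is, up to a constant,
% a quadratic objective over integer variables:
\[
  \min_{\mathbf{x} \in \{0,1\}^n} \; \mathbf{x}^{\top} Q \mathbf{x} + \mathbf{c}^{\top} \mathbf{x}
\]
```

Read side by side, the first two expressions show the unary/binary and field/coupling correspondence, and the third shows why binary-label energy minimization can be handed to a quadratic integer programming solver.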
Issue Date:2016-11-30
Rights Information:Copyright 2016 Huiguang Yang
Date Available in IDEALS:2017-03-01
Date Deposited:2016-12
