Files in this item

File: 3269964.pdf (2MB)
Access: Restricted to U of Illinois
Description: (no description provided)
Format: PDF (application/pdf)

Description

Title: Bicriterion Clustering and Selecting the Optimal Number of Clusters via Agreement Measure
Author(s): Liu, Heng
Doctoral Committee Chair(s): Douglas Simpson
Department / Program: Statistics
Discipline: Statistics
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: Ph.D.
Genre: Dissertation
Subject(s): Statistics
Abstract: Clustering and classification are important tools for a broad range of problems in fields such as image analysis and genomics. A clustering problem can be reduced to two tasks: estimating the number of clusters, and allocating each observation to a cluster. Many heuristic criteria are available; representative methods include k-means, hierarchical clustering, and partitioning around medoids. All of these methods face the problem of selecting the number of clusters. In addition, some algorithms, such as k-means, rely on a starting allocation of the observations, which may introduce bias. The resulting data partitions often lack consistency across criteria and algorithms. In this thesis, we propose an approach that selects the number of clusters by comparing and optimizing the agreement between two clustering criteria. The intuition is that the randomness in clustering across criteria should be minimized when the true clustering structure is recovered. By maximizing the agreement between methods on the allocation of the observations, the approach selects the optimal number of clusters and also yields a robust consensus set of clusters. Furthermore, we use a number of classification rules to combine the resulting clusters from the two algorithms. The favorable performance of the method is demonstrated in simulation studies and an application to fMRI time series. Finally, the asymptotic properties of the agreement statistics are discussed.
Issue Date: 2007
Type: Text
Language: English
Description: 101 p.
Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2007.
URI: http://hdl.handle.net/2142/87410
Other Identifier(s): (MiAaPQ)AAI3269964
Date Available in IDEALS: 2015-09-28
Date Deposited: 2007
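
The selection idea summarized in the abstract, choosing the number of clusters at which two partitioning criteria agree most, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the dissertation: the adjusted Rand index stands in for the agreement measure, two toy one-dimensional binning rules (`equal_width_labels`, `equal_count_labels`) stand in for real clustering algorithms such as k-means or hierarchical clustering, and the helper name `select_k` is hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions (needs >= 2 items)."""
    n = len(labels_a)
    # Pairs of items placed together by both partitions.
    index = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:  # degenerate case, e.g. both put all items together
        return 1.0
    return (index - expected) / (max_index - expected)


def equal_width_labels(values, k):
    """Toy stand-in criterion 1: split the value range into k equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_count_labels(values, k):
    """Toy stand-in criterion 2: split the sorted values into k near-equal groups."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * k // len(values)
    return labels


def select_k(values, method_a, method_b, candidates):
    """Return the candidate k with the highest agreement, plus all scores."""
    scores = {
        k: adjusted_rand_index(method_a(values, k), method_b(values, k))
        for k in candidates
    }
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Three visually obvious groups on a line (illustrative data only).
data = [0, 1, 2, 10, 11, 12, 20, 21, 22]
best_k, scores = select_k(data, equal_width_labels, equal_count_labels, [2, 3, 4])
print(best_k, scores)
```

On this toy data the two rules produce identical partitions only at k = 3, so that value is selected. With real clustering algorithms the agreement curve can be flatter or tied across several k, which this sketch does not attempt to handle.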

