Files in this item



application/pdf: Peng_Jiang.pdf (11MB)


Title: Pattern extraction and clustering for high-dimensional discrete data
Author(s): Jiang, Peng
Director of Research: Heath, Michael T.
Doctoral Committee Chair(s): Heath, Michael T.
Doctoral Committee Member(s): Olson, Luke N.; Zhai, ChengXiang; Park, Haesun
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): low-rank matrix factorization; binary matrix factorization; k-means clustering; approximation algorithm; pattern extraction; association rule mining; document clustering; weighted binary matrix factorization; bicluster discovery; densest k-subgraph; social network mining
Abstract: We explore connections of low-rank matrix factorizations with interesting problems in data mining and machine learning. We propose a framework for solving several low-rank matrix factorization problems, including binary matrix factorization, constrained binary matrix factorization, weighted constrained binary matrix factorization, densest k-subgraph, and orthogonal nonnegative matrix factorization. These combinatorial problems are NP-hard. Our goal is to develop effective approximation algorithms with good theoretical properties and apply them to solve various real application problems. We reformulate each of the problems as a special clustering problem that has the same optimal solution as the corresponding original problem. Making use of this property, we develop clustering algorithms to solve corresponding low-rank matrix factorization problems. We prove that most of our clustering algorithms have constant approximation ratios, which is a highly desirable property for NP-hard problems. We apply the proposed algorithms and compare them with existing methods for real applications in pattern extraction, document clustering, transaction data mining, recommender systems, bicluster discovery in gene expression data, social network mining, and image representation.
Issue Date: 2014-01-16
Rights Information: Copyright 2013 Peng Jiang
Date Available in IDEALS: 2014-01-16
Date Deposited: 2013-12
