Files in this item



3395564.pdf (application/pdf, 1MB). Restricted to U of Illinois.


Title:Large-Scale Constraint-Based Pattern Mining
Author(s):Zhu, Feida
Doctoral Committee Chair(s):Erickson, Jeff; Han, Jiawei
Department / Program:Computer Science
Discipline:Computer Science
Degree Granting Institution:University of Illinois at Urbana-Champaign
Subject(s):Computer Science
Abstract:We studied the problem of constraint-based pattern mining for three different data formats (item-set, sequence, and graph) and focused on mining patterns of large sizes. Colossal patterns in each data format were studied to discover pruning properties that are useful for direct mining of these patterns. For item-set data, we observed the robustness of colossal patterns. By defining the concept of core patterns, we developed a randomized mining framework to efficiently find a set of colossal patterns that gives a good approximation to the complete pattern set. The essential idea of pattern fusion and leaping toward large patterns was then extended to sequential and graph data. For sequential data, we developed a novel algorithm to accommodate approximate patterns. For graph data, we proposed the concept of spiders and used these pre-computed small frequent structures to leap quickly to much larger ones. We also proposed a general graph mining framework, called gPrune, that takes advantage of both pattern-space and data-space pruning. Ideas and techniques developed in this work can be extended to handle other user-specified constraints for direct, efficient mining in large-scale data.
Issue Date:2009
Description:89 p.
Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2009.
Other Identifier(s):(MiAaPQ)AAI3395564
Date Available in IDEALS:2015-09-25
Date Deposited:2009
