Files in this item



3131060.pdf (application/pdf, 6MB), restricted to U of Illinois


Title: Data Mining via Support Vector Machines: Scalability, Applicability, and Interpretability
Author(s): Yu, Hwan-Jo
Doctoral Committee Chair(s): Han, Jiawei
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Computer Science
Abstract: KDD (Knowledge Discovery and Data Mining) has been studied extensively over the last decade as data has continued to grow in size and complexity. This thesis introduces three practical data mining problems: (1) classifying with large data sets, (2) classifying without negative data (i.e., single-class classification), and (3) discovering discriminant feature combinations. It presents solutions based on a principled methodology, Support Vector Machines (SVMs), that produce higher-quality results with less human intervention. We first address several challenges in adopting SVM technology in the practice of data mining: (1) scalability: SVMs do not scale well with data size, while common data mining applications often involve millions or billions of data objects; (2) applicability: SVMs are limited to (semi-)supervised learning, which is mostly applied to binary classification problems; and (3) interpretability: it is hard to interpret and extract knowledge from SVM models. We then propose three principled solutions that address these challenges for the problems of large-scale classification, single-class classification, and discriminant feature combination discovery. The contributions of this thesis cover applications in bioinformatics and text and Web mining as well as methodologies of data mining and machine learning.
Issue Date: 2004
Description: 85 p.
Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2004.
Other Identifier(s): (MiAaPQ)AAI3131060
Date Available in IDEALS: 2015-09-25
Date Deposited: 2004
