Files in this item

Files                          Description                  Format
SHANG-THESIS-2017.pdf (1MB)    (no description provided)    PDF (application/pdf)

Description

Title:DPPred: an effective prediction framework with concise discriminative patterns and its biomedical applications
Author(s):Shang, Jingbo
Advisor(s):Han, Jiawei
Department / Program:Computer Science
Discipline:Computer Science
Degree Granting Institution:University of Illinois at Urbana-Champaign
Degree:M.S.
Genre:Thesis
Subject(s):discriminative pattern; explanatory; concise; classification; regression; biomedical application
Abstract:In the literature, two families of models have been proposed to address prediction problems, including classification and regression. Simple models, such as generalized linear models, offer only ordinary performance but strong interpretability on a set of simple features. The other family, which includes tree-based models, organizes numerical, categorical, and high-dimensional features into a comprehensive structure that captures rich interpretable information in the data. In this thesis, we propose a novel discriminative pattern-based prediction framework (DPPred) that accomplishes prediction tasks by combining the advantages of both families in effectiveness and interpretability. Specifically, DPPred adopts concise discriminative patterns that lie on the prefix paths from the root to leaf nodes in tree-based models. Moreover, DPPred selects a limited number of useful discriminative patterns by searching for the most effective pattern combination to fit generalized linear models. To validate the effectiveness of DPPred, we conduct experiments on both classification and regression tasks. Experimental results demonstrate that DPPred provides accuracy competitive with the state of the art as well as valuable interpretability for developers and experts. In particular, when studying health status for cardiopulmonary patients, DPPred achieves acceptable prediction accuracy (more than 95%) and reveals the importance of demographic features; when studying amyotrophic lateral sclerosis (ALS), DPPred not only outperforms the baselines using only 40 concise discriminative patterns out of a potentially exponentially large set of patterns, but also discovers novel markers.
Issue Date:2017-04-24
Type:Thesis
URI:http://hdl.handle.net/2142/97418
Rights Information:Copyright 2017 Jingbo Shang
Date Available in IDEALS:2017-08-10
Date Deposited:2017-05

