Title: GeneSet MAPR: Characterization of gene sets through heterogeneous network patterns
Author(s): Linkowski, Gregory
Advisor(s): Vasudevan, Shobha
Department / Program: Electrical & Computer Eng
Discipline: Electrical & Computer Engr
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): graph theory
machine learning
pattern recognition
big data
statistical analysis
Abstract: Often, machine learning and big data concepts are applied to problems without a proper appreciation of their limitations or domain context. At the same time, there is a growing appreciation for the ability of networks to represent more complex connections between data points than previous structures. However, established machine learning approaches rarely take advantage of such structures and must be adapted. We present here a method that utilizes patterns of connections within heterogeneous networks to score items by their similarity to an input set. We apply the idea of meta-paths as an abstraction to counteract the typical big data problems of noise and overfitting. We also aim to demystify the black-box nature of machine learning by providing intuitive feedback about why items are considered similar. While the method presented here is generalizable to any domain, the specific examples explored are within the genomics domain. The final tool, GeneSet MAPR, is especially useful in a domain with little ground truth and a huge volume of noisy, uncertain data. We show that GeneSet MAPR performs better at discovering related but concealed data points than an approach using the same data without abstraction, as well as an established state-of-the-art approach that works on a network but ignores the heterogeneous patterns. It does this while providing details the other methods cannot.
Issue Date: 2018-04-24
Rights Information: Copyright 2018 Gregory Linkowski
Date Available in IDEALS: 2018-09-04
Date Deposited: 2018-05
