Files in this item



LIN-THESIS-2016.pdf (application/pdf, 1MB)


Title: Towards the integration of genomic profiles and gene interaction networks for machine learning
Author(s): Lin, Henry A
Advisor(s): Han, Jiawei
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Drug response; Random Walk with Restart; Network Propagation
Abstract: With the advent of big data, scientists are collecting biological data faster than in the past, including genomic profiles that describe individuals by thousands of genes at a time. Adding to this body of knowledge are gene interaction networks, which model overarching cellular processes by describing how genes interact with each other. Given genomic profile data together with gene interaction data, the question becomes how to integrate these two sources of knowledge for machine learning. Previous studies have employed some form of feature engineering to "collapse" the network topology alongside the genomic profiles, losing the potential for global network information. Instead, we explore a framework based on network propagation. We explain how network propagation algorithms can enhance standalone genomic profiles, producing what we call embeddings, and show that these enhancements lead to improved predictive accuracy on drug response classification. We next show that these embeddings contain predictive signals that are not necessarily implicated by gene ranking methods such as PageRank. Last, we apply network propagation to a dataset presented by the DREAM organization and show that we can improve a naive linear regression for a drug sensitivity ranking task.
Issue Date: 2016-04-27
Rights Information: Copyright May 2016 by Henry A. Lin. All rights reserved.
Date Available in IDEALS: 2016-07-07
Date Deposited: 2016-05
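
Illustrative note: the abstract and subject terms name network propagation and Random Walk with Restart but this record does not give the thesis's formulation. The sketch below is therefore only a generic, textbook-style Random Walk with Restart, not the author's implementation. The function name, the toy three-gene network, the non-negative profile, and the restart value of 0.5 are all assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, profile, restart=0.5, tol=1e-9, max_iter=1000):
    """Propagate a per-gene profile over a gene interaction network.

    adjacency: square list of lists of non-negative edge weights (hypothetical input).
    profile:   non-negative per-gene scores for one sample, e.g. a binary mutation vector.
    restart:   probability of jumping back to the original profile at each step.
    """
    n = len(adjacency)
    # Column-normalise so each column sums to 1 (columns of isolated genes stay zero).
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # Normalise the profile so it can be read as a starting distribution.
    total = sum(profile)
    start = [p / total for p in profile] if total else list(profile)
    current = list(start)
    for _ in range(max_iter):
        # One propagation step: diffuse along edges, then mix the original profile back in.
        updated = [
            (1 - restart) * sum(transition[i][j] * current[j] for j in range(n))
            + restart * start[i]
            for i in range(n)
        ]
        change = sum(abs(u - c) for u, c in zip(updated, current))
        current = updated
        if change < tol:
            break
    return current


# Toy network (assumed): genes A - B - C in a path, with signal only on gene A.
toy_network = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
toy_profile = [1, 0, 0]
embedding = random_walk_with_restart(toy_network, toy_profile)
```

In this toy case the propagated vector assigns nonzero weight to genes B and C even though the input profile was zero there, which is the sense in which propagation can carry network-wide information into a per-sample feature vector.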
