Files in this item



application/pdf: FINNEGAN-DISSERTATION-2020.pdf (47MB)


Title: Understanding machine learning models of the epigenome with statistics and statistical mechanics
Author(s): Finnegan, Alex I
Director of Research: Song, Jun S
Doctoral Committee Chair(s): Oono, Yoshitsugu
Doctoral Committee Member(s): Aksimentiev, Aleksei; Chemla, Yann
Department / Program: Physics
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Artificial neural networks; Machine learning; Maximum entropy
Abstract: Epigenetic changes are chemical and structural modifications of DNA and its associated proteins that do not change the DNA sequence. These modifications mark and package DNA in different ways and help to establish cell types, which are distinct and heritable gene expression states. Understanding determinants of epigenetic modifications and how these modifications affect gene expression is a major challenge with important implications in developmental biology and medicine. Meeting this challenge requires methods for predicting biologically relevant events from a large number of degrees of freedom that interact via unknown rules. The nature of this problem, along with large amounts of data provided by high-throughput sequencing techniques, motivates a machine-learning approach. This work uses artificial neural networks to predict binding of proteins involved in 3-dimensional organization of DNA as well as locations of methylation marks deposited by DNA methyltransferase enzymes. To understand the rules underlying the sequence-based prediction of our models, we apply interpretation methods based on sampling from constrained maximum entropy distributions. We consider biological and biophysical implications of the important sequence patterns revealed by interpretation. In the case of DNA methylation, our statistical methods help us understand how methylation affects gene expression as well as how cells in our engineered yeast system respond to DNA methylation stress. Finally, we study the diversity of single-cell gene expression across the cell types of the human skin and demonstrate coordination between changes in epigenetically modified loci and changes in expression of transcription factor proteins predicted to bind these loci.
Issue Date: 2020-01-21
Rights Information: © 2020 by Alex I. Finnegan. All rights reserved.
Date Available in IDEALS: 2020-08-26
Date Deposited: 2020-05
