

Title: Extraction and identification of frequent sequential patterns in transcription factor binding site organization of enhancers
Author(s): Bissonnette, Paul
Advisor(s): Ma, Jian
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Frequent Sequential Pattern Mining (FSPM); Data Mining; Transcription Factor (TF); Transcription Factor Binding Site (TFBS)
Abstract: The advent of laboratory techniques to assess protein-DNA interactions and chromatin post-translational modifications has created vast arrays of data correlating binding sites and regulatory elements with genome regions. There has been great interest in developing new computational approaches that leverage these annotations to better understand the layout and organization of eukaryotic genomes and gene regulation. In particular, while tools exist to accurately map the coding regions of the genome, few methods leverage new annotations in the non-coding regions, which may hold the key to many important questions about gene regulation. By exploiting the results of large-scale annotation studies such as ENCODE to identify cell-type-specific enhancers and transcription factor binding sites, it is possible to better understand the enhancer regions of the genome responsible for promoting transcriptional activity. Using frequent sequential pattern mining techniques from the classical data mining field, this thesis explored the importance of linear order in enhancer role and structure. Common orderings of binding sites within enhancers were identified across nine cell lines. Putative targets of enhancers exhibiting these patterns were then determined and clustered based on functional classification. Examination of the detected patterns indicated that while the choice of transcription factors in a pattern often correlated with the overall function of putative targets, the ordering of these binding sites might be an effective classifier of more specific functional activity. Additionally, the findings suggested that the arrangement of binding sites within enhancers was more likely to be cell-specific than the transcription factor binding sites themselves.
The knowledge that binding site pattern is more strongly linked to target function than binding site co-association alone could lead to important advances in our understanding of the diverse regulatory roles of different enhancers. It could also yield new functional annotations for enhancers, genes, and transcription factors whose roles have not yet been studied intensively.
Issue Date: 2014-05-30
Rights Information: Copyright 2014 Paul Bissonnette
Date Available in IDEALS: 2014-05-30
Date Deposited: 2014-05
