Files in this item



ZHAO-DISSERTATION-2016.pdf (application/pdf, 2MB)


Title: Bioinformatics analyses of non-coding genomic elements
Author(s): Zhao, Kai
Director of Research: Roca, Alfred L.
Doctoral Committee Chair(s): Roca, Alfred L.
Doctoral Committee Member(s): Cardoso, Felipe C.; Loor, Juan J.; Sinha, Saurabh
Department / Program: Animal Sciences
Discipline: Animal Sciences
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): transcription factors
Hox genes
endogenous retroviruses
Sumatran rhinoceros
Phascolarctos cinereus
koala retrovirus
Abstract: Mammalian genomes consist primarily of non-coding sequences (Kellis et al. 2014). Originally castigated as "junk DNA", many non-coding regions have now been characterized as having functional roles, or have been determined to be the causal agent for diseases. Additionally, sequences that are non-functional can be used as neutral markers for population genetics. Determining the role of non-coding sequences or finding sequences usable as neutral markers is computationally and biologically non-trivial. However, recent advances in molecular biology, in particular the reduced cost of next-generation sequencing (NGS), have enabled new experiments that involve these sequences. I will discuss studies using bioinformatics that leveraged these advances to characterize three types of non-coding sequences: endogenous retroviruses, microsatellite markers and transcription factor binding sites. I conducted the bioinformatics design, coding and analyses, working with collaborators who verified findings in the laboratory. The only retrovirus known to be currently transitioning from exogenous to endogenous form is the koala retrovirus (KoRV), making koalas (Phascolarctos cinereus) ideal for examining the early stages of retroviral endogenization. In the first study, I developed a bioinformatics routine to identify distinct retroviral integrants from NGS reads of KoRV flanks isolated using koala genomic DNA. In the second study, I developed computationally efficient, user-friendly software that identifies polymorphic microsatellite loci using NGS reads, then designs oligonucleotide primers appropriate for amplifying those loci. We developed this software to enable studies that improve understanding of population structure and estimate population size and genetic diversity in genetically depauperate wildlife species.
In the third study, I developed a bioinformatics pipeline to characterize gene expression changes during development in the fetal limb tissue of several mammalian species, to better understand the mechanistic differences across evolutionary lineages. We compared development in four species of mammals. The house mouse was used since it is a well-characterized model organism with five digits. The domestic pig was used since it is a well-studied agricultural animal and a model for digit reduction. A species of bat was used since bats undergo wing development. Finally, a species of opossum was used as an outgroup to the three eutherian species.
Issue Date: 2016-04-15
Rights Information: Copyright 2016 Kai Zhao
Date Available in IDEALS: 2016-07-07
Date Deposited: 2016-05
