Novel computational methods for discordance-aware phylogenomic analysis
Tabatabaee, Seyedeh Yasamin
Permalink
https://hdl.handle.net/2142/132481
Description
- Title
- Novel computational methods for discordance-aware phylogenomic analysis
- Author(s)
- Tabatabaee, Seyedeh Yasamin
- Issue Date
- 2025-11-11
- Director of Research (if dissertation) or Advisor (if thesis)
- Warnow, Tandy
- Doctoral Committee Chair(s)
- Warnow, Tandy
- Committee Member(s)
- El-Kebir, Mohammed
- Gropp, William
- Liu, Ge
- Mirarab, Siavash
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- phylogenetics
- phylogenomics
- gene tree discordance
- species tree estimation
- multi-species coalescent
- Abstract
- Inferring the evolutionary history of a set of species is a key step in many biological and medical research projects, as species trees provide a context in which problems in comparative genomics, biodiversity, phylogeography and epidemiology can be addressed. Recent advances in sequencing technologies have made genome-scale data increasingly available, and today phylogenomics projects construct species trees using hundreds to thousands of loci, potentially whole genomes. However, species tree estimation from multi-locus datasets presents several statistical and computational challenges, as most problems in this area are NP-hard. In addition, different locations within the genome of a species can evolve differently, a phenomenon known as "gene tree heterogeneity" that arises from biological processes such as incomplete lineage sorting, gene duplication and loss, and horizontal gene transfer, and that further complicates species tree estimation. Despite advances in developing methods that can estimate an unrooted and non-parameterized topology of a species tree in the presence of gene tree discordance, less attention has been paid to estimating the root location, quantifying branch lengths in units that are usable for downstream analysis, and estimating divergence times. All of these are necessary for many applications of phylogenomics, including constructing the tree of life and analyzing the origins of diseases such as HIV and COVID-19. In this dissertation, we introduce new computational methods for these tasks, collectively referred to as "post-species tree analysis", and these methods address different sources of gene tree discordance. For these methods, we present rigorous theoretical results, including proofs of statistical consistency and analyses of sample complexity and running time, as well as extensive empirical results on simulated and biological datasets ranging from the root of the tree of life to recent speciations.
Overall, these methods provide high accuracy and scalability for estimating the root, branch lengths and divergence times in the presence of gene tree discordance, and some are accompanied by strong theoretical guarantees.
- Graduation Semester
- 2025-12
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/132481
- Copyright and License Information
- Copyright 2025 Seyedeh Yasamin Tabatabaee
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois