Permalink
https://hdl.handle.net/2142/116102
Description
Title
Probabilistic subclonal reconstruction for cancer
Author(s)
Kim, Juho
Issue Date
2022-07-14
Director of Research (if dissertation) or Advisor (if thesis)
Koyejo, Oluwasanmi
El-Kebir, Mohammed
Doctoral Committee Chair(s)
Koyejo, Oluwasanmi
El-Kebir, Mohammed
Committee Member(s)
Milenkovic, Olgica
Shomorony, Ilan
Chia, Nicholas
Department of Study
Electrical & Computer Eng
Discipline
Electrical & Computer Engr
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
Cancer genomics
Tumor phylogenetics
Probabilistic modeling
Machine learning
Abstract
Cancer consists of genetically heterogeneous populations of cells that arise through a process of subclonal evolution. Reconstructing the evolutionary processes that give rise to cancer can help us better understand cancer progression and prioritize treatment targets. Subclonal reconstruction provides information about which mutations co-occur within the same subclone, the proportion of cells belonging to each subclone, and the ancestral relationships among subclones. The evolutionary process can be described by inferring tumor phylogenetic trees. Most current approaches either address mutation clustering or tree inference in isolation, or rely on computationally expensive algorithms to perform clustering and tree inference jointly.
In this dissertation, we formalize the problem of reconstructing the subclonal structure of cancer via probabilistic modeling. Using variant and total read counts obtained from bulk DNA sequencing data as input, we introduce a tree-constrained binomial mixture model and an expectation-maximization (EM) method to estimate the cluster assignment of each mutation and the underlying frequency of each cluster. Our EM algorithm employs a linear programming approach to accurately maximize the likelihood bound subject to tree constraints. We choose the optimal tree topology by repeating the process across all possible tree topologies. Compared to existing work, the resulting ClusTree algorithm more accurately identifies mutation clusters, estimates the frequency of each cluster, and detects the correct tree topology, especially for low-depth sequencing data.
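To make the modeling idea concrete, the following is a minimal sketch of EM for a plain binomial mixture over variant and total read counts. It is not the ClusTree algorithm: it deliberately omits the tree constraints and the linear programming step described above, and uses the standard closed-form M-step of an unconstrained mixture instead. The function name `em_binomial_mixture`, its parameters, and the evenly spaced initialization are assumptions made for illustration only.

```python
import math


def em_binomial_mixture(variant, total, k, n_iter=200, eps=1e-9):
    """Unconstrained binomial-mixture EM (illustrative sketch, no tree constraints).

    variant[i] / total[i] are the variant and total read counts of mutation i.
    Returns (cluster frequencies, mixing weights, hard cluster labels).
    """
    n = len(variant)
    # Hypothetical initialization: spread frequencies evenly over (0, 1).
    freqs = [(j + 1) / (k + 1) for j in range(k)]
    weights = [1.0 / k] * k
    resp = [[1.0 / k] * k for _ in range(n)]

    for _ in range(n_iter):
        # E-step: posterior responsibility of each cluster for each mutation.
        # The binomial coefficient is the same for every cluster, so it cancels.
        for i in range(n):
            v, d = variant[i], total[i]
            logp = [
                math.log(weights[j])
                + v * math.log(freqs[j])
                + (d - v) * math.log(1.0 - freqs[j])
                for j in range(k)
            ]
            m = max(logp)  # subtract the max for numerical stability
            unnorm = [math.exp(lp - m) for lp in logp]
            s = sum(unnorm)
            resp[i] = [u / s for u in unnorm]

        # M-step: closed-form updates for the unconstrained model.
        # (The dissertation instead maximizes the bound under tree constraints.)
        for j in range(k):
            rj = sum(resp[i][j] for i in range(n))
            # Clip so the logarithms in the next E-step stay finite.
            weights[j] = max(rj / n, eps)
            num = sum(resp[i][j] * variant[i] for i in range(n))
            den = sum(resp[i][j] * total[i] for i in range(n))
            f = num / den if den > 0 else freqs[j]
            freqs[j] = min(max(f, eps), 1.0 - eps)

    # Hard assignment: the cluster with the largest responsibility.
    labels = [max(range(k), key=lambda j: resp[i][j]) for i in range(n)]
    return freqs, weights, labels


if __name__ == "__main__":
    # Toy data: four mutations near frequency 0.05 and four near 0.45.
    variant = [5, 6, 4, 5, 45, 44, 46, 45]
    total = [100] * 8
    freqs, weights, labels = em_binomial_mixture(variant, total, k=2)
    print(freqs, weights, labels)
```

In this sketch each cluster frequency is updated independently of the others. A tree-constrained variant would couple the frequencies of related clusters in the M-step, which is where a constrained optimizer such as a linear program would replace the closed-form update.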