Probabilistic subclonal reconstruction for cancer

Kim, Juho

Permalink

https://hdl.handle.net/2142/116102

Description

Title

Probabilistic subclonal reconstruction for cancer

Author(s)

Kim, Juho

Issue Date

2022-07-14

Director of Research (if dissertation) or Advisor (if thesis)

Koyejo, Oluwasanmi
El-Kebir, Mohammed

Doctoral Committee Chair(s)

Koyejo, Oluwasanmi
El-Kebir, Mohammed

Committee Member(s)

Milenkovic, Olgica
Shomorony, Ilan
Chia, Nicholas

Department of Study

Electrical and Computer Engineering

Discipline

Electrical and Computer Engineering

Degree Granting Institution

University of Illinois at Urbana-Champaign

Degree Name

Ph.D.

Degree Level

Dissertation

Keyword(s)

Cancer genomics
Tumor phylogenetics
Probabilistic modeling
Machine learning

Language

eng

Abstract

Cancer consists of genetically heterogeneous populations of cells that arise through subclonal evolution. Reconstructing the evolutionary processes that give rise to cancer can help us better understand cancer progression and prioritize treatment targets. Subclonal reconstruction reveals which mutations co-occur within the same subclone, the proportion of cells belonging to each subclone, and the ancestral relationships among subclones. These evolutionary relationships can be described by inferring tumor phylogenetic trees. Most current approaches either address mutation clustering or tree inference in isolation, or rely on computationally expensive algorithms to perform clustering and tree inference jointly. In this dissertation, we formalize the problem of reconstructing the subclonal structure of cancer via probabilistic modeling. Using variant and total read counts obtained from bulk DNA sequencing data as input, we introduce a tree-constrained binomial mixture model and an expectation-maximization (EM) method to estimate the cluster assignment of each mutation and the underlying frequency of each cluster. Our EM algorithm employs linear programming to accurately maximize the likelihood bound subject to tree constraints. We choose the optimal tree topology by repeating the process across all possible tree topologies. Compared to existing work, the resulting ClusTree algorithm more accurately identifies mutation clusters, estimates the frequency of each cluster, and detects the correct tree topology, especially for low-depth sequencing data.
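
As a rough illustration of the EM component mentioned in the abstract, the following minimal Python sketch fits a plain binomial mixture to variant and total read counts. It is not the ClusTree implementation: the function names (`log_binom_pmf`, `em_binomial_mixture`), the quantile-based initialization, and the closed-form M-step are assumptions made for this example. In particular, the sketch omits the tree constraints, the linear program used in the dissertation's M-step, and the search over tree topologies.

```python
import math


def log_binom_pmf(v, d, f):
    """Log-probability of v variant reads out of d total reads at frequency f."""
    f = min(max(f, 1e-9), 1.0 - 1e-9)  # keep log() finite at the boundaries
    log_choose = math.lgamma(d + 1) - math.lgamma(v + 1) - math.lgamma(d - v + 1)
    return log_choose + v * math.log(f) + (d - v) * math.log(1.0 - f)


def em_binomial_mixture(variant, total, k, n_iter=200):
    """Fit a k-component binomial mixture to (variant, total) read counts.

    Returns (weights, freqs, assignments). Assumption: the M-step below is the
    unconstrained closed form, NOT the tree-constrained linear program
    described in the abstract.
    """
    n = len(variant)
    # Hypothetical initialization: evenly spaced quantiles of the observed
    # variant allele fractions, so the result is deterministic.
    vafs = sorted(variant[i] / total[i] for i in range(n))
    freqs = [vafs[int((c + 0.5) * n / k)] for c in range(k)]
    weights = [1.0 / k] * k
    resp = [[0.0] * k for _ in range(n)]

    for _ in range(n_iter):
        # E-step: posterior responsibility of each cluster for each mutation,
        # computed in log space and normalized with the log-sum-exp trick.
        for i in range(n):
            logs = [
                math.log(weights[c]) + log_binom_pmf(variant[i], total[i], freqs[c])
                for c in range(k)
            ]
            top = max(logs)
            exps = [math.exp(x - top) for x in logs]
            norm = sum(exps)
            resp[i] = [e / norm for e in exps]

        # M-step: responsibility-weighted maximum-likelihood updates.
        for c in range(k):
            mass = sum(resp[i][c] for i in range(n))
            weights[c] = max(mass / n, 1e-12)  # avoid log(0) in the next E-step
            reads_variant = sum(resp[i][c] * variant[i] for i in range(n))
            reads_total = sum(resp[i][c] * total[i] for i in range(n))
            if reads_total > 0:
                freqs[c] = reads_variant / reads_total

    # Hard assignment: the most responsible cluster for each mutation.
    assignments = [max(range(k), key=lambda c: resp[i][c]) for i in range(n)]
    return weights, freqs, assignments
```

Calling `em_binomial_mixture(variant, total, k)` with two equal-length lists of read counts returns the mixture weights, one estimated frequency per cluster, and a cluster index per mutation. A tree-constrained variant would replace the closed-form frequency update with a constrained optimization over all cluster frequencies at once.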

Graduation Semester

2022-08

Type of Resource

Thesis

Handle URL

https://hdl.handle.net/2142/116102

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Dissertations and Theses - Electrical and Computer Engineering
