Files in this item

File:VAISHNAVISUBRAMANIAN-THESIS-2018.pdf (13MB)
Description:(no description provided)
Format:PDF (application/pdf)

Description

Title:Multimodal data analysis applied to a medical setting
Author(s):Subramanian, Vaishnavi
Advisor(s):Do, Minh N.
Department / Program:Electrical & Computer Engineering
Discipline:Electrical & Computer Engineering
Degree Granting Institution:University of Illinois at Urbana-Champaign
Degree:M.S.
Genre:Thesis
Subject(s):histopathology
image
multimodal
data
spatial
correlation
CCA
medical
genes
TCGA
cancer
sparsity
Abstract:Complex diseases, such as cancer, have traditionally been studied using either genetic data or images alone. To understand the biology of such diseases, joint analysis of multiple data modalities could provide additional insights. We propose the use of canonical correlation analysis (CCA) as a preliminary discovery tool for identifying connections across modalities, specifically between gene expression and features describing cell and nucleus shape, texture, and stain intensity in histopathological images. Interactions between different types of cells are an important indicator of disease status and should also be captured. To that end, it is crucial to quantify and utilize the spatial distribution of the various cell types within the examined tissue at different scales. We employ Ripley's K-statistic, a descriptor traditionally used in geographical information systems, which captures the spatial distribution patterns of individual point sets and the interactions between multiple point sets. We propose to improve the histopathology image features by incorporating this descriptor to capture the spatial distribution of the cells and the interactions between lymphocytes and epithelial cells. Applied to 615 breast cancer samples from The Cancer Genome Atlas (TCGA), CCA and Sparse CCA revealed significant correlations of 0.736 (p ≈ 1e-14) and 0.471 (p ≈ 7e-3), respectively, between several image features and the expression of PAM50 genes, which are known to be linked to outcome. Sparse CCA, an extension of CCA based on sparsity, revealed associations with enrichment of pathways implicated in cancer without leveraging prior biological understanding. The utility of Ripley's K-statistic in the context of imaging-genetics is demonstrated on the histopathology images of 710 TCGA breast invasive carcinoma (BRCA) patients by its superior correlations with gene expression.
These findings affirm the utility of CCA for joint phenotype-genotype analysis of cancer and the importance of capturing spatial features at multiple scales.
Issue Date:2018-04-16
Type:Text
URI:http://hdl.handle.net/2142/100986
Rights Information:Copyright 2018 - Vaishnavi Subramanian
Date Available in IDEALS:2018-09-04
Date Deposited:2018-05
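
The abstract relies on two computations: classical CCA between two feature blocks, and a cross-type Ripley's K-statistic between two point sets. The following is a minimal sketch of both, not the thesis code. The function names, the small ridge term `reg`, the absence of edge correction in the K estimate, and the synthetic inputs are all assumptions made for illustration; Sparse CCA is not covered.

```python
import numpy as np


def canonical_correlations(X, Y, reg=1e-8):
    """Classical CCA: canonical correlations between the columns of X and Y.

    X is (n_samples, p), Y is (n_samples, q). `reg` is a small ridge term
    (an assumption of this sketch) that keeps the covariances invertible.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Whiten both blocks with their Cholesky factors: M = Lx^{-1} Sxy Ly^{-T}.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    # The singular values of M are the canonical correlations.
    return np.clip(np.linalg.svd(M, compute_uv=False), 0.0, 1.0)


def ripley_cross_k(points_a, points_b, radii, area):
    """Naive cross-type Ripley's K (no edge correction).

    For each radius r: K(r) = area / (n_a * n_b) * #{(a, b) : dist(a, b) <= r}.
    """
    points_a = np.asarray(points_a, dtype=float)
    points_b = np.asarray(points_b, dtype=float)
    # Pairwise Euclidean distances between the two point sets.
    d = np.linalg.norm(points_a[:, None, :] - points_b[None, :, :], axis=-1)
    counts = np.array([(d <= r).sum() for r in radii])
    return area * counts / (len(points_a) * len(points_b))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for image features and gene expression.
    image_features = rng.normal(size=(100, 4))
    expression = image_features[:, :2] + 0.1 * rng.normal(size=(100, 2))
    print(canonical_correlations(image_features, expression))

    # Synthetic stand-ins for lymphocyte and epithelial cell centroids.
    lymphocytes = rng.uniform(0.0, 1.0, size=(30, 2))
    epithelial = rng.uniform(0.0, 1.0, size=(40, 2))
    print(ripley_cross_k(lymphocytes, epithelial, [0.05, 0.1, 0.2], area=1.0))
```

Evaluating the K-statistic over a range of radii gives one value per scale, which is one way to obtain multi-scale spatial features that could then be appended to the image feature block passed to CCA.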

