Files in this item

File:YIN-DISSERTATION-2020.pdf (5MB), restricted to U of Illinois
Description:(no description provided)
Format:PDF (application/pdf)

Description

Title:Bayesian variable selection in high dimensional censored regression models
Author(s):Yin, Wenjing
Director of Research:Liang, Feng; Narisetty, Naveen Naidu
Doctoral Committee Chair(s):Liang, Feng; Narisetty, Naveen Naidu
Doctoral Committee Member(s):Zhao, Sihai Dave; Douglas, Jeffrey A
Department / Program:Statistics
Discipline:Statistics
Degree Granting Institution:University of Illinois at Urbana-Champaign
Degree:Ph.D.
Genre:Dissertation
Subject(s):Bayesian Variable Selection
Survival Analysis
Censored Regression Models
Spike-and-slab Priors
Abstract:Advances in technology are driving research on variable selection in many fields, especially in biomedical areas where high-dimensional gene expression data are available. Various approaches have been developed for associating patients' data with their survival times; however, few can handle high-dimensional data and censoring at the same time. We focus on developing scalable algorithms for variable selection in high-dimensional censored regression models that can handle gene expression data with hundreds of thousands of features. We propose an EM-like iterative algorithm for accelerated failure time (AFT) models with censored survival data under no distributional assumption. Unlike existing methods, the proposed method handles high-dimensional variable selection under less stringent assumptions, and it inherits its scalability from a Bayesian framework with a continuous spike-and-slab prior specification. We show that the method can be further extended to bivariate survival data by assuming independence between the two events, so that the connections between events are carried in the joint prior distribution of the unknown coefficient matrix. Lastly, we work with a relatively new class of regression models, Restricted Mean Survival Time (RMST) regression models, targeting variable selection when the proportional hazards assumption is invalid, as well as prediction of the RMST. With a spike-and-slab Lasso prior, we transform the Bayesian variable selection framework of the RMST regression model into a penalized logistic regression problem with a proper choice of the link function. The proposed work can be treated as a more general variable selection method than either the Cox model or the AFT model when the true model is misspecified.
Issue Date:2020-11-30
Type:Thesis
URI:http://hdl.handle.net/2142/109509
Rights Information:Copyright 2020 Wenjing Yin
Date Available in IDEALS:2021-03-05
Date Deposited:2020-12

