Files in this item

File: WANG-DISSERTATION-2016.pdf (2MB)
Description: (no description provided)
Format: PDF (application/pdf)

Description

Title: Scalable algorithms for Bayesian variable selection
Author(s): Wang, Jin
Director of Research: Liang, Feng
Doctoral Committee Chair(s): Liang, Feng
Doctoral Committee Member(s): Marden, John I.; Ji, Yuan; Zhao, Dave; Park, Trevor
Department / Program: Statistics
Discipline: Statistics
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: Ph.D.
Genre: Dissertation
Subject(s): Variable Selection; EM; Ensemble; Variational Bayes; Asymptotic Analysis; Logistic model
Abstract: The innovation of modern technologies drives research and development on high-dimensional data analysis in diverse fields, where variable selection plays a pivotal role in ensuring credible model estimation. We focus on scalable algorithms for variable selection that can handle large data sets. First, we propose an EM algorithm that returns the MAP estimate of the set of relevant variables. Due to its particular updating scheme, our algorithm can be implemented efficiently. We also show that the MAP estimate returned by our EM algorithm achieves variable selection consistency. In practice, the EM algorithm tends to get stuck at local peaks, so we propose an ensemble version: repeatedly apply the EM algorithm to a subset of bootstrap sample data and then aggregate the results. Empirical studies demonstrate the superior performance of this Bayesian Bootstrap EM algorithm. Second, we propose a hybrid computation framework for Bayesian variable selection. This new algorithm, SAB, combines the classical EM algorithm and the variational Bayes algorithm. It is very fast in handling high-dimensional data with a large number of covariates. To address a critical biological problem, we apply SAB to a state-of-the-art cancer genomics data set with the goal of understanding the complex regulatory relationship between miRNAs and mRNAs in cancer. In the third part, we study the asymptotic behavior of the SAB algorithm in detail and prove that SAB achieves selection consistency, Bayesian consistency, and an oracle property when the number of covariates grows exponentially with the sample size. Lastly, we extend the hybrid framework of Bayesian variable selection to logistic models, where we adopt the Polya-Gamma specification and show that this specification is equivalent to the local approximation method in the variational Bayes framework.
Issue Date: 2016-07-14
Type: Thesis
URI: http://hdl.handle.net/2142/92827
Rights Information: Copyright 2016 Jin Wang
Date Available in IDEALS: 2016-11-10
Date Deposited: 2016-08

