Bayesian sparsity learning with variational automatic relevance determination

Liu, Zihe

Permalink

https://hdl.handle.net/2142/127191

Description

Title

Bayesian sparsity learning with variational automatic relevance determination

Author(s)

Liu, Zihe

Issue Date

2024-11-18

Director of Research (if dissertation) or Advisor (if thesis)

Liang, Feng

Doctoral Committee Chair(s)

Liang, Feng

Committee Member(s)

Chen, Yuguo
Liu, Jingbo
Yang, Yun

Department of Study

Statistics

Discipline

Statistics

Degree Granting Institution

University of Illinois at Urbana-Champaign

Degree Name

Ph.D.

Degree Level

Dissertation

Keyword(s)

Sparsity
Automatic Relevance Determination
Empirical Bayes
Variational Technique
High-dimensional Linear Regression
Generalized Additive Model
Convergence

Abstract

Automatic Relevance Determination (ARD) is a well-regarded Bayesian approach to feature selection, in which each feature’s relevance is encoded in a hyper-parameter that is tuned automatically from the data. However, estimating the ARD prior via the evidence function poses significant computational challenges: there is no closed-form solution, and the computation scales poorly. Existing ARD research focuses primarily on algorithm development, with limited theoretical understanding of the method's properties. In this thesis, we introduce Variational Automatic Relevance Determination (VARD), a novel approach that estimates the ARD prior efficiently through a variational method. We examine the statistical properties of VARD in the context of high-dimensional linear regression, providing convergence guarantees for both parameter estimation and variable selection. Additionally, we extend the VARD framework to additive models, enabling simultaneous estimation of smoothness and relevance for each feature.

The first part of this thesis studies the ARD procedure within high-dimensional linear regression under sparsity assumptions. Our proposed VARD method approximates the posterior distribution with independent Gaussian distributions for each regression coefficient, where some distributions converge to a point mass at zero, automatically excluding irrelevant variables. We establish convergence results and present an efficient coordinate descent algorithm to implement VARD, demonstrating its empirical performance on simulated datasets.

In the second part, we extend VARD to sparse additive models in high-dimensional settings. VARD uniquely enables independent smoothness estimation for each feature, distinguishing whether a feature’s effect on the response is zero, linear, or nonlinear. We again provide an efficient coordinate descent algorithm for implementation. Empirical evaluations on simulated and real-world data highlight VARD’s advantages over alternative variable selection methods for additive models.
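
For readers unfamiliar with ARD, the classical evidence-based procedure mentioned at the start of the abstract can be sketched as follows. This is a minimal illustration under stated assumptions and is not the thesis's VARD algorithm: it uses the standard MacKay-style fixed-point updates for a Gaussian linear model with one prior precision per coefficient. The function name `ard_regression`, the pruning threshold `prune_at`, and the simulated data are hypothetical choices made for illustration only.

```python
import numpy as np

def ard_regression(X, y, n_iter=300, prune_at=1e6):
    """Classical ARD for linear regression (illustrative sketch, not VARD).

    Model: y = X w + noise, noise precision beta, prior w_j ~ N(0, 1/alpha_j).
    The hyper-parameters alpha_j and beta are tuned by fixed-point updates
    that increase the evidence (marginal likelihood).
    """
    n, p = X.shape
    alpha = np.ones(p)                 # one relevance hyper-parameter per feature
    beta = 1.0 / np.var(y)             # crude initial noise precision
    active = np.ones(p, dtype=bool)    # features not yet pruned
    mu = np.zeros(p)
    for _ in range(n_iter):
        Xa = X[:, active]
        # Gaussian posterior over the active coefficients
        Sigma = np.linalg.inv(beta * Xa.T @ Xa + np.diag(alpha[active]))
        mu_a = beta * Sigma @ (Xa.T @ y)
        mu = np.zeros(p)
        mu[active] = mu_a
        # gamma_j in [0, 1]: how strongly the data determine coefficient j
        gamma = np.clip(1.0 - alpha[active] * np.diag(Sigma), 0.0, 1.0)
        # Fixed-point updates for the hyper-parameters
        alpha[active] = gamma / (mu_a ** 2 + 1e-12)
        resid = y - Xa @ mu_a
        beta = (n - gamma.sum()) / (resid @ resid)
        # A very large alpha_j pins w_j at zero: treat feature j as irrelevant
        active &= alpha < prune_at
        if not active.any():
            break
    return np.where(active, mu, 0.0), alpha, active

# Simulated example (assumed setup): 3 relevant features out of 10
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
w_true = np.zeros(10)
w_true[:3] = [3.0, -2.0, 1.5]
y = X @ w_true + 0.5 * rng.normal(size=200)

mu, alpha, active = ard_regression(X, y)
print("estimated coefficients:", np.round(mu, 2))
print("features kept:", np.flatnonzero(active))
```

Each iteration inverts a matrix whose size equals the number of active features, which gives some intuition for the scalability concern the abstract raises about evidence-based estimation.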

Graduation Semester

2024-12

Type of Resource

Thesis

Handle URL

https://hdl.handle.net/2142/127191

Owning Collections

Graduate Dissertations and Theses at Illinois (primary)

Graduate Theses and Dissertations at Illinois
