Files in this item



WANG-DISSERTATION-2017.pdf (application/pdf, 2MB)
(no description provided)


Title: Probabilistic latent variable models for knowledge discovery and optimization
Author(s): Wang, Xiaolong
Director of Research: Zhai, Chengxiang
Doctoral Committee Chair(s): Zhai, Chengxiang
Doctoral Committee Member(s): Han, Jiawei; Forsyth, David A; Nedich, Angelia; Zhang, Joy Y
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Probabilistic latent variable models (PLVMs); Knowledge discovery and optimization
Abstract: I conduct a systematic study of probabilistic latent variable models (PLVMs) with applications to knowledge discovery and optimization. Probabilistic modeling is a principled means of gaining insight into data. By assuming that the observed data are generated from a distribution, we can estimate its density, or the statistics of interest, by either Maximum Likelihood Estimation or Bayesian inference, depending on whether there is a prior distribution over the parameters of the assumed data distribution. One of the primary goals of many machine learning and data mining models is to reveal the knowledge underlying observed data. A common practice is to introduce latent variables, which are modeled together with the observations. Such latent variables represent, for example, class assignments (labels), cluster memberships, and other unobserved measurements of the data. Moreover, proper exploitation of latent variables facilitates the optimization itself, which leads to computationally efficient inference algorithms. In this thesis, I describe a range of applications where latent variables can be leveraged for knowledge discovery and efficient optimization. The work in this thesis demonstrates that PLVMs are a powerful tool for modeling incomplete observations. By incorporating latent variables and assuming that observations such as citations, pairwise preferences, and text are generated from tractable distributions parametrized by the latent variables, PLVMs offer a flexible and effective way to discover knowledge in data mining problems, where the knowledge is mathematically modelled as continuous or discrete values, distributions, or uncertainty. I also explore PLVMs for deriving efficient algorithms. It has been shown that latent variables can be employed as a means of model reduction and can facilitate the computation of, and sampling from, intractable distributions.
Our results lead to algorithms that take advantage of latent variables in probabilistic models. We conduct experiments against state-of-the-art models, and empirical evaluation shows that the proposed approaches improve both learning performance and computational efficiency.
Issue Date: 2017-04-11
Rights Information: Copyright 2016 Xiaolong Wang
Date Available in IDEALS: 2017-08-10
Date Deposited: 2017-05
