Files in this item

File: Rajhans_Samdani.pdf (2MB)
Description: (no description provided)
Format: PDF (application/pdf)
Description

Title: Algorithms for structural learning with decompositions
Author(s): Samdani, Rajhans
Director of Research: Roth, Dan
Doctoral Committee Chair(s): Roth, Dan
Doctoral Committee Member(s): Forsyth, David A.; Hasegawa-Johnson, Mark A.; Joachims, Thorsten
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: Ph.D.
Genre: Dissertation
Subject(s): Machine Learning; Structured Output Prediction; Natural Language Processing
Abstract: Structured prediction refers to problems that involve predicting multiple output variables with expressive and complex interdependencies and constraints. Learning over such expressive structures (structural learning) is usually time-consuming, as exploring the structured space can be intractable. The goal of this thesis is to present techniques for structural learning that learn by decomposing the problem space into simpler, tractable components. We consider three settings: fully supervised learning; unsupervised and semi-supervised learning; and learning with discriminative latent variables. For supervised structural learning, we describe a paradigm called Decomposed Learning (DecL), which decomposes the inference procedure during learning into small inference steps using additional application-specific information. For unsupervised learning, we propose a family of Expectation Maximization [Dempster et al., 1977] algorithms called Unified Expectation Maximization (UEM) [Samdani et al., 2012a] that covers several seemingly divergent variants of EM, e.g., hard EM. To efficiently incorporate domain-specific declarative constraints into learning, we use a dual projected subgradient ascent algorithm that naturally decomposes the task into simpler components. In the discriminative latent variable setting, we present a supervised latent variable model for clustering, the Latent Left-Linking Model (L3M), which can efficiently cluster items arriving in streaming order. We decompose the learning process for L3M into small, efficient stochastic gradient descent steps that lead to rapid convergence.
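To give a flavor of how a single framework can span "seemingly divergent" EM variants such as soft and hard EM, the sketch below uses a temperature-style E-step in which the posterior is raised to a power 1/gamma. This gamma-power parameterization is an illustrative assumption in the spirit of annealed EM, not the exact formulation in the UEM paper (which parameterizes the E-step's KL-divergence objective); see Samdani et al. [2012a] for the actual algorithm.

```python
import numpy as np

def e_step(log_p, gamma):
    """Temperature-style E-step over one item's cluster posteriors.

    log_p : log-probabilities of the latent assignments under the model.
    gamma : temperature. gamma = 1 recovers the standard (soft) EM
            posterior; gamma -> 0 approaches hard EM, putting all
            mass on the argmax assignment.
    NOTE: illustrative parameterization, not UEM's exact objective.
    """
    scaled = log_p / gamma
    scaled -= scaled.max()        # subtract max for numerical stability
    q = np.exp(scaled)
    return q / q.sum()

log_p = np.log(np.array([0.2, 0.5, 0.3]))
soft = e_step(log_p, gamma=1.0)   # standard EM posterior
hard = e_step(log_p, gamma=1e-3)  # nearly one-hot, as in hard EM
```

Varying gamma between these endpoints interpolates through a continuum of EM variants, which is the kind of unification the thesis exploits.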
Issue Date: 2014-01-16
URI: http://hdl.handle.net/2142/46596
Rights Information: Copyright 2013 Rajhans Samdani
Date Available in IDEALS: 2014-01-16
Date Deposited: 2013-12

