Files in this item
File: Tang_Hao.pdf (3MB)
Description: (no description provided)
Format: application/pdf
Description
Title:  One-vector representations of stochastic signals for pattern recognition 
Author(s):  Tang, Hao 
Director of Research:  Huang, Thomas S. 
Doctoral Committee Chair(s):  Huang, Thomas S. 
Doctoral Committee Member(s):  Levinson, Stephen E.; Hasegawa-Johnson, Mark A.; Ouyang, Yanfeng 
Department / Program:  Electrical & Computer Engineering 
Discipline:  Electrical & Computer Engineering 
Degree Granting Institution:  University of Illinois at Urbana-Champaign 
Degree:  Ph.D. 
Genre:  Dissertation 
Subject(s):  Pattern Recognition; Stochastic Signal; One-Vector Representation; Hidden Markov Model 
Abstract:  When building a pattern recognition system, we primarily deal with stochastic signals such as speech, images, and video. Ideally, a stochastic signal is of a one-vector form, so that it appears as a single data point in a possibly high-dimensional representational space, since the majority of pattern recognition algorithms are designed to handle stochastic signals having a one-vector representation. More importantly, a one-vector representation naturally allows for optimal distance metric learning from the data, which generally accounts for significant performance gains in many pattern recognition tasks. This is motivated and demonstrated by our work on semi-supervised speaker clustering, where a speech utterance is represented by a Gaussian mixture model (GMM) mean supervector formed from the component means of a GMM that is adapted from a universal background model (UBM), which encodes our prior knowledge of speakers in general. Combined with a novel distance metric learning technique that we propose, namely linear spherical discriminant analysis, which performs discriminant analysis in the cosine space, the GMM mean supervector representation of utterances leads to state-of-the-art speaker clustering performance. Noting that the main criticism of the GMM mean supervector representation is that it assumes independent and identically distributed feature vectors, which is far from true in practice, we propose two novel one-vector representations of stochastic signals: one based on adapted ergodic hidden Markov models (HMMs) and one based on adapted left-to-right HMMs. In these one-vector representations, a single vector is constructed by a transformation of the parameters of an HMM that is adapted from a UBM by various controllable degrees, where the transformation is mathematically derived from an upper bound approximation of the Kullback-Leibler divergence rate between two adapted HMMs. 
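The GMM mean supervector construction described above can be illustrated with a minimal sketch: a UBM's component means are MAP-adapted toward an utterance's feature vectors (with a relevance factor controlling the degree of adaptation), and the adapted means are stacked into a single vector. This is a generic sketch of mean-only MAP adaptation, not the dissertation's exact implementation; the function names, the relevance factor `r`, and the diagonal-covariance assumption are illustrative choices.

```python
import numpy as np

def map_adapt_means(ubm_means, ubm_covars, ubm_weights, X, r=16.0):
    """MAP-adapt the component means of a diagonal-covariance GMM (the UBM)
    to the feature vectors X (n frames x d dims). The relevance factor r
    controls how far the means move from the UBM prior toward the data."""
    n, d = X.shape
    k = ubm_means.shape[0]
    # Posterior responsibilities of each UBM component for each frame,
    # computed in the log domain for numerical stability.
    log_p = np.empty((n, k))
    for j in range(k):
        diff = X - ubm_means[j]
        log_p[:, j] = (np.log(ubm_weights[j])
                       - 0.5 * np.sum(np.log(2 * np.pi * ubm_covars[j]))
                       - 0.5 * np.sum(diff ** 2 / ubm_covars[j], axis=1))
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)

    # Zeroth- and first-order sufficient statistics per component.
    n_j = gamma.sum(axis=0)          # (k,)  soft frame counts
    f_j = gamma.T @ X                # (k, d) soft first-order sums

    # MAP update: interpolate between the data mean and the UBM prior mean.
    alpha = n_j / (n_j + r)
    data_mean = f_j / np.maximum(n_j, 1e-10)[:, None]
    return alpha[:, None] * data_mean + (1 - alpha)[:, None] * ubm_means

def mean_supervector(adapted_means):
    """Stack the k adapted d-dimensional means into one (k*d,) supervector."""
    return adapted_means.reshape(-1)
```

With a large relevance factor the adapted means stay at the UBM prior; with a small one they follow the utterance's data, which is the "controllable degree of adaptation" idea.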
These one-vector representations possess a set of very attractive properties and are rather generic in nature, so they can be used with various types of stochastic signals (e.g., speech, image, video) and applied to a broad range of pattern recognition tasks (e.g., classification, regression). In addition, we propose a general framework for one-vector representations of stochastic signals for pattern recognition, of which the proposed one-vector representations based on adapted ergodic HMMs and adapted left-to-right HMMs are two special cases. The general framework can serve as a unified and principled guide for constructing "the best" one-vector representations of stochastic signals of various types and for various pattern recognition tasks. Based on different types of underlying statistical models carefully chosen to best fit the nature of the stochastic signals, "the best" one-vector representations may be constructed by a possibly nonlinear transformation of the parameters of the underlying statistical models learned from the stochastic signals, where the transformation may be mathematically derived from a properly chosen distance measure between two statistical models that has an elegant root in the Kullback-Leibler theory. Since most work in this dissertation is based on HMMs, we contribute to this fascinating tool by proposing a new maximum likelihood learning algorithm for HMMs, which we refer to as the boosting Baum-Welch algorithm. In the proposed boosting Baum-Welch algorithm, we formulate the HMM learning problem as an incremental optimization procedure which performs a sequential gradient descent search on a loss functional for a good fit in an inner product function space. 
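The idea of deriving the vector transformation from a Kullback-Leibler-based distance can be sketched in the simpler GMM (i.i.d.) case rather than the dissertation's HMM divergence rate: for two GMMs MAP-adapted from the same UBM (so they share weights and covariances and differ only in means), a standard matched-component upper bound on the KL divergence reduces to a weighted Euclidean distance between suitably scaled mean supervectors. The function names below are illustrative; this is a sketch of the general principle, not the dissertation's exact derivation.

```python
import numpy as np

def kl_upper_bound(means_f, means_g, weights, covars):
    """Matched-component upper bound on KL(f || g) for two diagonal-covariance
    GMMs that share weights and covariances (e.g., two models MAP-adapted from
    the same UBM with only the means updated): sum_j w_j * KL(N_fj || N_gj),
    where each component KL collapses to 0.5 * sum_i (mu_f - mu_g)^2 / sigma^2."""
    per_comp = 0.5 * np.sum((means_f - means_g) ** 2 / covars, axis=1)
    return float(np.sum(weights * per_comp))

def normalized_supervector(means, weights, covars):
    """Scale each mean by sqrt(w_j / (2 * sigma^2)) and stack, so that the
    plain squared Euclidean distance between two such supervectors equals
    the KL upper bound above."""
    scale = np.sqrt(weights[:, None] / (2.0 * covars))
    return (scale * means).reshape(-1)
```

This identity is what makes the supervector representation live in a space where ordinary vector-space machinery (and hence distance metric learning) applies.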
The boosting Baum-Welch algorithm can serve as an alternative maximum likelihood learning algorithm for HMMs to the traditional Baum-Welch or expectation-maximization (EM) algorithm, and is a preferred method when insufficient training data is available. Compared to the traditional Baum-Welch or EM algorithm, the boosting Baum-Welch algorithm is less susceptible to the overfitting problem (a known general property of maximum likelihood estimation techniques) in that it tends to produce a "large margin" effect. 
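For context on what both the traditional and the boosting Baum-Welch algorithms maximize, the HMM log-likelihood itself is computed with the classical forward algorithm. The sketch below shows the scaled forward recursion for a discrete-observation HMM; it is standard textbook material, not the dissertation's boosting variant, and the function name is illustrative.

```python
import numpy as np

def hmm_log_likelihood(pi, A, B, obs):
    """Log-likelihood log p(obs) of a discrete-observation HMM via the
    forward algorithm with per-step scaling to avoid underflow.
    pi: initial state probs (k,); A: transition matrix (k, k);
    B: emission probs (k, m); obs: sequence of symbol indices."""
    alpha = pi * B[:, obs[0]]          # forward variable at t = 0
    c = alpha.sum()                    # scaling constant
    log_l = np.log(c)
    alpha /= c
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # propagate and emit
        c = alpha.sum()
        log_l += np.log(c)             # accumulate log of scaling constants
        alpha /= c
    return log_l
```

Maximum likelihood HMM training, whether by the classical EM updates or by an incremental gradient-descent scheme such as the boosting formulation described above, seeks parameters (pi, A, B) that maximize this quantity over the training data.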
Issue Date:  2011-01-21 
URI:  http://hdl.handle.net/2142/18595 
Rights Information:  Copyright 2010 Hao Tang 
Date Available in IDEALS:  2011-01-21; 2013-01-08 
Date Deposited:  2010-12 
This item appears in the following Collection(s)

Graduate Dissertations and Theses at Illinois
Dissertations and Theses - Electrical and Computer Engineering