Files in this item



application/pdfZHANG-THESIS-2015.pdf (556kB)
(no description provided)PDF


Title:Probabilistic generative modeling of speech
Author(s):Zhang, Yang
Advisor(s):Hasegawa-Johnson, Mark A.
Department / Program:Electrical & Computer Engineering
Discipline:Electrical & Computer Engineering
Degree Granting Institution:University of Illinois at Urbana-Champaign
Subject(s):Probabilistic acoustic tube
speech modeling
speech analysis
generative model
Abstract:Speech processing refers to a set of tasks that involve speech analysis and synthesis. Most speech processing algorithms model a subset of speech parameters of interest and blur the rest using signal processing techniques and feature extraction. However, evidence shows that many speech parameters can be more accurately estimated if they are modeled jointly; speech synthesis also benefits from joint modeling. This thesis proposes a probabilistic generative model for speech called the Probabilistic Acoustic Tube (PAT). The highlights of the model are threefold. First, it is among the very first works to build a complete probabilistic model for speech. Second, it has a well-designed model for the phase spectrum of speech, which has been hard to model and often neglected. Third, it models the AM-FM effects in speech, which are perceptually significant but often ignored in frame-based speech processing algorithms. Experiment shows that the proposed model has good potential for a number of speech processing tasks.
Issue Date:2015-11-24
Rights Information:Copyright 2015 Yang Zhang
Date Available in IDEALS:2016-03-02
Date Deposited:2015-12

This item appears in the following Collection(s)

Item Statistics