Title:The two targets of speech production: Two levels of specification
Author(s):He, Mingkan
Contributor(s):Levinson, Stephen
Degree:B.S. (bachelor's)
Subject(s):speech synthesizer
articulatory phonology
Abstract:A thread running through this work is the difference in how articulatory and perceptual features of phonology are integrated into speech production. The idea emerged while researching and simulating speech synthesizers suitable for a fellow graduate student to adopt in a proposal for developing an automatic speech acquisition system. In the course of using available synthesizers to produce speech utterances, it became apparent that these two kinds of features determine the two levels of input to speech synthesizers, namely tasks and muscle activities. This thesis follows two existing models, each accepting one level of input. The TADA approach [1] maintains that the input to a speech synthesizer should be tasks, which consist of specifications of tract variables, such as the locations and degrees of constrictions, as functions of time. More precisely, these tasks are called gestural scores, which are explained later in the thesis. Praat [2], on the other hand, takes muscle activities as input: the articulatory input specifications initially control the lengths and tensions of the muscles rather than the positions of the articulators. After a brief introduction to these two speech synthesizers, Part I and Part II assess and compare their time efficiency and perceptual accuracy by comparing the simulated results from each against real-world sounds. At the end of both parts, suggestions are offered on which category of synthesizer should be adopted for various research focuses in articulatory phonology.
Issue Date:2016-05
Date Available in IDEALS:2016-08-26

