Files in this item

File: CHEN-DISSERTATION-2019.pdf
Format: application/pdf (PDF)
Size: 4MB
Access: Restricted to U of Illinois
Description: (no description provided)


Title: Modeling phones, keywords, topics and intents in spoken languages
Author(s): Chen, Wenda
Director of Research: Hasegawa-Johnson, Mark
Doctoral Committee Chair(s): Hasegawa-Johnson, Mark
Doctoral Committee Member(s): Li, Haizhou; Levinson, Stephen E.; Varshney, Lav R.
Department / Program: Electrical & Computer Eng
Discipline: Electrical & Computer Engr
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Speech recognition; Low-resource languages; Transfer learning; Distant supervision; Spoken language understanding; Spoken term detection
Abstract: Spoken language understanding for both rich-resource languages (RRL) and low-resource languages (LRL) is an important research area for academia and the commercial world. In conversational situations where the language spoken is a minority one or the environment is noisy, barriers emerge between the communicators. People would like to understand the basic components of any language spoken by those they meet in their daily lives. Machines, in turn, can be trained to model basic language components such as phones, keywords, topics, and intents during both human/machine interactions and human/human communications. If we can develop a machine assistant that helps people understand the basic meaning of speech in any language, we could make the human world much more efficient and harmonious. This thesis addresses the problem with the help of mismatched-crowdsourcing-based distant supervision, linguistic knowledge, and corpus-based transfer learning. First, we analyze the usefulness of mismatched transcripts and distinctive features, and then propose phone recognition based on optimized inference of the phone set of the low-resource language from clustering of the mismatched transcripts. Subsequently, keyword discovery from the phone-level results is explored. The topic information collected in the corpus is then used as additional knowledge for topic classification and for further improving phone recognition. Finally, the intents of the speaker are obtained from the keyword sequence. The experimental results show that, with the help of data collection design and existing knowledge, we can achieve reasonably good machine language understanding for languages whose phones, keywords, topics, and intents were not learned before. This work will lead to further investigations in the area of spoken language understanding in any language.
Issue Date: 2019-07-02
Rights Information: Copyright 2019 Wenda Chen
Date Available in IDEALS: 2019-11-26
Date Deposited: 2019-08
