Files in this item



JACOBS-THESIS-2016.pdf (application/pdf, 763kB)


Title: Knowing a thing is "a thing": The use of acoustic features in multiword expression extraction
Author(s): Jacobs, Cassandra L.
Advisor(s): Fleck, Margaret
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Speech processing; Language models
Abstract: Speakers of a language need to have complex linguistic representations for speaking, often on the level of non-literal, idiomatic expressions like black sheep. Typically, datasets of these so-called multiword expressions come from hand-crafted ontologies or lexicons, because identifying expressions like these in an unsupervised manner is still an unsolved problem in natural language processing. In this thesis I demonstrate that prosodic features, which are helpful in parsing syntax and interpreting meaning, can also be used to identify multiword expressions. To do this, I extracted noun phrases from the Buckeye corpus, which contains spontaneous spoken language, and matched these noun phrases to page titles in Wikipedia, a massive, freely available encyclopedic ontology of entities and phenomena. By incorporating prosodic features into a model that distinguishes between multiword expressions that are found in Wikipedia titles and those that are not, we see increases in classifier performance that suggest that prosodic cues can help with the automatic extraction of multiword expressions from spontaneous speech, helping models and potentially listeners decide whether something is "a thing" or not.
Issue Date: 2016-07-19
Rights Information: Copyright 2016 Cassandra Jacobs
Date Available in IDEALS: 2016-11-10
Date Deposited: 2016-08
