Files in this item

File: GONG-DISSERTATION-2020.pdf (4MB), Restricted to U of Illinois
Description: (no description provided)
Format: PDF (application/pdf)

Description

Title: Representation learning of natural language and its application to language understanding and generation
Author(s): Gong, Hongyu
Director of Research: Bhat, Suma
Doctoral Committee Chair(s): Bhat, Suma
Doctoral Committee Member(s): Viswanath, Pramod; Srikant, Rayadurgam; Hwu, Wen-mei; Fanti, Giulia
Department / Program: Electrical & Computer Eng
Discipline: Electrical & Computer Engr
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: Ph.D.
Genre: Dissertation
Subject(s): Natural Language Processing; Representation Learning; Language Understanding; Language Generation
Abstract: How to properly represent language is a fundamental problem in Natural Language Processing (NLP). Language representation learning aims to encode rich information, such as the syntax and semantics of the language, into dense vectors. It facilitates the modeling, manipulation, and analysis of natural language in computational linguistics. Existing algorithms utilize corpus statistics such as word co-occurrences to learn general-purpose language representations. Recent advances in generic representation integrate richer information, such as contextualized features, from unlabeled text corpora. In this dissertation, we continue this line of research to incorporate rich knowledge into generic embeddings. We show that word representations can be enriched with various kinds of information, including temporal and spatial variation as well as syntactic function, and that text representations can be refined with topical knowledge. Moreover, we develop insight into the geometry of pre-trained representations and connect it to semantic understanding tasks such as identifying idiomatic word usage. Beyond generic representations, this dissertation also studies task-dependent representations for downstream applications, where representations are trained to encode domain information from labeled datasets. We leverage the capability of neural network models to integrate task-specific supervision into language representations, and we introduce new deep learning models and algorithms to train representations with external knowledge from annotated data. We show that the learned representations assist in various downstream tasks in language understanding, such as text classification, and in language generation, such as text style transfer.
Issue Date: 2020-04-15
Type: Thesis
URI: http://hdl.handle.net/2142/108110
Rights Information: Copyright 2020 Hongyu Gong
Date Available in IDEALS: 2020-08-26
Date Deposited: 2020-05