Files in this item



LIN-DISSERTATION-2020.pdf (application/pdf, 5MB), restricted to U of Illinois


Title: Multilingual multitask joint neural information extraction
Author(s): Lin, Ying
Director of Research: Ji, Heng
Doctoral Committee Chair(s): Ji, Heng
Doctoral Committee Member(s): Han, Jiawei; Zhai, ChengXiang; Roth, Dan; Stoyanov, Veselin
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Joint Information Extraction; Multitask Learning
Abstract: In the age of information overload, the ability to automatically extract useful structured information from text is urgently needed by a wide range of applications, such as information retrieval and question answering. Over the past decades, researchers have proposed various Information Extraction (IE) techniques to discover important knowledge elements (e.g., entities, relations, events) in unstructured documents. However, because these approaches typically rely on hand-crafted rules or manually annotated data, adapting them to new settings, such as new languages, domains, scenarios, or genres, is usually expensive. Therefore, the goal of this thesis is to develop more robust and portable models for information extraction tasks and to build a joint neural architecture that performs multiple IE tasks within a single model. We first focus on the generality of IE models. As most existing neural models use word embeddings as input features, they are sensitive to the quality of word representations. We investigate the factors that may cause performance degradation when a name tagger is applied to new data and tackle this issue from two aspects: (1) Robustness. The reliability and the amount of information of each feature are inconsistent across words. We incorporate reliability signals and dynamic feature composition to enable the model to select reliable and effective features. (2) Generality. Overfitting is a major problem that leads to a large performance gap between seen and unseen names. As a solution, we encourage the model to leverage contextual features, which are more general. Next, we explore the portability of models for sequence labeling, the underlying problem of many Natural Language Processing (NLP) tasks such as name tagging. Current models cannot be applied to very dissimilar settings (e.g., other languages), whereas annotating new data for all possible settings is infeasible. Hence, we propose to transfer knowledge across different models through multitask learning to reduce the need for data annotation. To maximize the knowledge being transferred, we design a unified and extendable architecture that integrates multiple transfer approaches. After that, we extend this framework to more IE tasks and propose a joint neural architecture, OneIE, that performs multilingual entity, relation, and event extraction simultaneously. In addition to multitask learning, we further incorporate global features to capture the cross-subtask and cross-instance interactions among knowledge elements. Finally, we propose OneIE to perform joint inference without using additional global features.
Issue Date: 2020-12-02
Rights Information: Copyright 2020 Ying Lin
Date Available in IDEALS: 2021-03-05
Date Deposited: 2020-12
