Files in this item

File:WANG-THESIS-2015.pdf (857kB)
Format:PDF (application/pdf)
Description:(no description provided)

Description

Title:Information-based Event Coreference
Author(s):Wang, Ruichen
Advisor(s):Roth, Dan
Department / Program:Computer Science
Discipline:Computer Science
Degree Granting Institution:University of Illinois at Urbana-Champaign
Degree:M.S.
Genre:Thesis
Subject(s):Event Coreference; Easy-First Clustering; Event Argument
Abstract:Event Coreference is an important module in the event extraction task, and it has been shown to be difficult to solve. The goal is to link mentions that refer to the same event so that their information can be aggregated. The task can be further split into two slightly different subtasks: Within-Doc Event Coreference and Cross-Doc Event Coreference. Most related publications approach Event Coreference in two steps: train or design a similarity metric for event mention pairs, then apply a clustering algorithm to the event mention space using the similarity metric as the distance. In this work, we identify two major problems that have been neglected. The first is that coreference does not imply full event mention similarity, because event mentions tend to contain partial and even complementary information. The second is that the order in which event mention pairs are compared can be important: comparing pairs that have complete and trustworthy information first, rather than pairs that have incomplete and untrustworthy information, could reduce the error rate. We propose Core Similarity, a new argument-based similarity metric, to solve the first problem, and two information-based clustering algorithms for the second problem: Informative-First Clustering (IFC) for the within-doc setting and Topic-Side Event Clustering (TSEC) for the cross-doc setting. These clustering algorithms are based on the idea of Event Information, which is defined in this work. Finally, the EVCO system is delivered with all of these components implemented.
Issue Date:2015-12-10
Type:Thesis
URI:http://hdl.handle.net/2142/89094
Rights Information:Copyright 2015 by Ruichen Wang
Date Available in IDEALS:2016-03-02
Date Deposited:2015-12

