Annotation-free location mention mining from text corpora
Liu, Tingcong
Permalink
https://hdl.handle.net/2142/125538
Description
- Title
- Annotation-free location mention mining from text corpora
- Author(s)
- Liu, Tingcong
- Issue Date
- 2024-06-27
- Director of Research (if dissertation) or Advisor (if thesis)
- Han, Jiawei
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Text Mining
- Pre-trained Language Model
- Language
- eng
- Abstract
- This thesis provides a novel framework to extract location mentions from text corpora. Location mention mining plays an important role in analyzing and extracting structured knowledge from real-world text corpora such as news and social media. Existing methods mainly rely on NER models or semantic parsers to extract locations, but they suffer from the following problems: (a) entities tagged by NER models as LOC or GPE may not represent locations in context. For example, in the sentence S1: “Ukraine forces are approaching Russia-held Kherson”, only “Kherson” is a true location mention, although “Ukraine” and “Russia” are also of GPE type; and (b) a semantic parser cannot recognize locations in verb phrases. In S1, although “Kherson” refers to a location, a semantic parser cannot extract it as a locative argument because it does not follow a preposition. This thesis defines a new task, location mention mining, which aims to extract from a corpus all mentions that correspond to real-world locations given their context, and proposes an annotation-free method, LocMine, which (1) constructs location-indicative term repositories using a background corpus and a knowledge base, (2) extracts and mines context-free location mentions based on the repositories, and (3) classifies context-dependent location mentions with pre-trained language models. Extensive experiments and case studies show that LocMine achieves the best performance among all compared methods in its ability to mine a complete set of location mentions from real-world corpora.
- Graduation Semester
- 2024-08
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/125538
- Copyright and License Information
- Copyright 2024 Tingcong Liu
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Computer Science
Dissertations and Theses from the Siebel School of Computer Science