Annotation-free location mention mining from text corpora
Liu, Tingcong
Permalink
https://hdl.handle.net/2142/125538
Description
- Title
- Annotation-free location mention mining from text corpora
- Author(s)
- Liu, Tingcong
- Issue Date
- 2024-06-27
- Director of Research (if dissertation) or Advisor (if thesis)
- Han, Jiawei
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Text Mining
- Pre-trained Language Model
- Abstract
- This thesis presents a novel framework for extracting location mentions from text corpora. Location mention mining plays an important role in analyzing and extracting structured knowledge from real-world text corpora such as news and social media. Existing methods mainly rely on NER models or semantic parsers to extract locations, but they suffer from two problems: (a) Entities tagged by NER models as LOC or GPE may not represent locations in context. For example, in the sentence S1: “Ukraine forces are approaching Russia-held Kherson”, only “Kherson” is a true location mention, although “Ukraine” and “Russia” are also of GPE type. (b) A semantic parser cannot recognize locations in verb phrases. In S1, although “Kherson” refers to a location, a semantic parser cannot extract it as a locative argument because it does not follow a preposition. This thesis defines a new task, location mention mining, which aims to extract from a corpus all mentions that correspond to real-world locations in their context, and proposes an annotation-free method, LocMine, which (1) constructs location-indicative term repositories using a background corpus and a knowledge base, (2) extracts and mines context-free location mentions based on the repositories, and (3) classifies context-dependent location mentions with pre-trained language models. Extensive experiments and case studies show that LocMine achieves the best performance among all compared methods in its ability to mine a complete set of location mentions from real-world corpora.
- Graduation Semester
- 2024-08
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/125538
- Copyright and License Information
- Copyright 2024 Tingcong Liu
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois