Files in this item
Files | Description | Format |
---|---|---|
application/pdf | (no description provided) |
Description
Title: | Mining social media stimulus from news article text using weakly-supervised narrative classification |
Author(s): | Qiu, Wenda |
Advisor(s): | Han, Jiawei |
Department / Program: | Computer Science |
Discipline: | Computer Science |
Degree Granting Institution: | University of Illinois at Urbana-Champaign |
Degree: | M.S. |
Genre: | Thesis |
Subject(s): | Text Mining; News Classification |
Abstract: | To simulate social media activity accurately, we first need to identify the stimulus in external sources. In this work, we model stimulus mining as a narrative classification task on a news article dataset. Previous state-of-the-art text classification methods cannot be applied directly here, mainly because of the following challenges: 1) Lack of training data: the given news article data has no narrative labels, and we cannot afford manual labeling beyond a small evaluation set. 2) The complexity of narratives: narratives are defined in a more complex way than the classes used in a classical news classification dataset, which prevents us from using existing weakly supervised text classification methods that depend heavily on class name semantics. 3) The noisy news article dataset: the collected dataset does not guarantee that a document belongs to any of the narratives. In such cases, the power of the self-training strategy widely used in existing weak supervision methods is limited. To address these challenges, we propose a narrative decomposition and re-grouping strategy and a relevance filtering module to fully utilize the power of weakly supervised classification methods. We conduct extensive experiments on two datasets set against the background of real global events and further propose two ways to combine different results into an optimal stimulus time series. |
Issue Date: | 2021-04-27 |
Type: | Thesis |
URI: | http://hdl.handle.net/2142/110579 |
Rights Information: | Copyright 2021 Wenda Qiu |
Date Available in IDEALS: | 2021-09-17 |
Date Deposited: | 2021-05 |
This item appears in the following Collection(s)
- Dissertations and Theses - Computer Science: Dissertations and Theses from the Dept. of Computer Science
- Graduate Dissertations and Theses at Illinois: Graduate Theses and Dissertations at Illinois