Weakly supervised aspect extraction for domain-specific texts
Guo, Fang
Permalink
https://hdl.handle.net/2142/108548
Description
Title
Weakly supervised aspect extraction for domain-specific texts
Author(s)
Guo, Fang
Issue Date
2020-07-24
Director of Research (if dissertation) or Advisor (if thesis)
Han, Jiawei
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Aspect Extraction
Weakly-supervised
Abstract
Aspect extraction, that is, identifying the aspects of text segments from a pre-defined set of aspects, is one of the keystones of text understanding. It benefits numerous applications, including sentiment analysis and product review summarization. Most existing aspect extraction methods rely heavily on human-curated aspect annotations of massive numbers of text segments, which makes them expensive to apply in specific domains. Recent attempts that leverage clustering methods can alleviate this annotation effort, but they require domain-specific knowledge and effort to further filter, aggregate, and align the clustering results to the desired aspects. Therefore, in this paper, we explore extracting aspects from domain-specific raw texts with very limited supervision: only a few user-provided seed words per aspect. Specifically, our proposed neural model is equipped with multi-head attention and self-training. The multi-head attention is learned from the seed words to ensure that aspect-related words in text segments are weighted higher than unrelated ones. The self-training mechanism provides additional pseudo labels to supplement the limited supervision. Extensive experiments on real-world datasets demonstrate the superior performance of our proposed framework, as well as the effectiveness of both the attention module and the self-training mechanism. Case studies on the attention weights further shed light on the interpretability of our aspect extraction results.
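The two ideas named in the abstract, seed-guided attention and self-training with pseudo labels, can be illustrated with a toy sketch. This is not the thesis's neural model: the learned multi-head attention is replaced here by a fixed softmax that boosts lexicon words (one "head" per aspect), and the self-training step simply grows each aspect's lexicon from confidently labeled segments. The seed words, example segments, function names, and the `boost` and `threshold` parameters are all hypothetical and chosen only for illustration.

```python
import math

def attention(tokens, lexicon, boost=2.0):
    """Softmax over per-token scores: lexicon words score `boost`, others 0."""
    exps = [math.exp(boost if t in lexicon else 0.0) for t in tokens]
    z = sum(exps)
    return [e / z for e in exps]

def predict(tokens, lexicons):
    """Return (aspect, confidence), or (None, 0.0) if no lexicon word occurs."""
    mass = {}
    for aspect, lex in lexicons.items():
        weights = attention(tokens, lex)
        # Attention mass that this aspect's "head" places on its own words.
        mass[aspect] = sum(w for w, t in zip(weights, tokens) if t in lex)
    total = sum(mass.values())
    if total == 0.0:
        return None, 0.0
    best = max(mass, key=mass.get)
    return best, mass[best] / total

def self_train(segments, seeds, rounds=2, threshold=0.8):
    """Pseudo-label confident segments, then grow each aspect's lexicon."""
    lexicons = {a: set(ws) for a, ws in seeds.items()}
    labels = {}
    for _ in range(rounds):
        for i, seg in enumerate(segments):
            aspect, conf = predict(seg.split(), lexicons)
            if aspect is not None and conf >= threshold:
                labels[i] = aspect  # pseudo label
        # Collect the words seen under each aspect's pseudo-labeled segments.
        seen = {a: set() for a in lexicons}
        for i, a in labels.items():
            seen[a].update(segments[i].split())
        # Keep only words exclusive to one aspect (drops "the", "was", ...).
        for a in lexicons:
            others = set().union(*(seen[b] for b in seen if b != a))
            lexicons[a] |= seen[a] - others
    return labels, lexicons

# Hypothetical seeds and segments (assumptions, not data from the thesis).
SEEDS = {"food": {"pizza", "tasty"}, "service": {"waiter", "rude"}}
SEGMENTS = [
    "the pizza was tasty",
    "the waiter was rude",
    "tasty pasta",
    "pasta arrived cold",  # contains no seed word
]

labels, lexicons = self_train(SEGMENTS, SEEDS)
print(labels)
```

In this toy run the last segment contains no seed word, so it can only be labeled after the first round has added "pasta" to the food lexicon, which is the role the abstract assigns to pseudo labels. A real implementation along the lines the abstract describes would learn the attention weights with a neural network rather than fix them by rule.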