Files in this item



poster_shub_jana_sail.pptx (6MB), application/vnd.openxmlformats-officedocument.presentationml.presentation
Poster file, PPTX format [Editable], Microsoft PowerPoint 2007


poster_shub_jana_sail.pdf (881kB), application/pdf
Poster file, PDF format [Non Editable], PDF


Title: Incremental sentiment prediction based on human in the loop learning
Author(s): Mishra, Shubhanshu; Diesner, Jana; Tao, Liang; Surbeck, Elizabeth; Byrne, Jason
Subject(s): sentiment analysis; incremental learning; machine learning
Abstract: Predicting various types of sentiment for textual data sources is an intensely studied problem in linguistics and computing. Despite progress with computational solutions beyond deterministic lookup dictionaries, practitioners often use dictionary-based, off-the-shelf tools to classify consumer products as being perceived in a positive, negative, or neutral way. Advanced scientific solutions have moved far beyond this approach, providing probabilistic solutions that consider a variety of lexical, syntactic, and surface-form features. How useful are such rigorously built and tested scientific solutions for practitioners? We answer this question in three ways. First, we compare the results from a solution we built via supervised learning to the predictions from a main commercial benchmark tool. For this, we were provided with about 25K unique tweets that were hand-tagged by subject matter experts from a large corporation in the food industry sector. Our solution builds upon existing approaches for sentiment classification of Twitter data, including suitable part-of-speech tagging. We achieve about 80% accuracy (F1-score). Second, we have practitioners use our solution to classify their consumer product data and provide feedback on usability, accuracy beyond precision and recall, and scalability. Third, we refine our prediction model through incremental learning, where the same subject matter experts inspect and, if applicable, relabel prediction outcomes, resulting in gradually improved and changing models. This incremental, human-in-the-loop learning also accounts for the dynamic nature of trends and patterns in language use on social media. To support this process, we have designed visual analytics routines intended to help end users with efficient data inspection, error analysis, and close reading.
Overall, this comprehensive and collaborative approach to sentiment analysis and evaluation contributes to making scientific research available for industry-scale applications.
Issue Date: 2015-04-03
Publisher: GSLIS Research Showcase 2015
Genre: Conference Poster
Sponsor: Anheuser Busch, grant number 2014-04922
Date Available in IDEALS: 2015-04-22
