Files in this item



application/pdf: RAI-DISSERTATION-2018.pdf (33MB)


Title: Inferring landscape preferences from social media using data science techniques
Author(s): Rai, Ankit
Director of Research: Minsker, Barbara
Doctoral Committee Chair(s): Minsker, Barbara
Doctoral Committee Member(s): Diesner, Jana; Smaragdis, Paris; Sullivan, William
Department / Program: Graduate College Programs
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Computer Vision, Machine Learning, Data Science, Social Media, Natural Language Processing, Statistical Methods, Object Detection, Green Infrastructure, Human Preference, Sentiment Analysis, Data Mining
Abstract: People and societies attribute different values to landscapes, which are often derived from their preferences. Such preferences are shaped by aesthetics, recreational benefits, safety, and other services provided by landscapes. Researchers have found that more appealing landscapes can promote human health and well-being. Existing methods used to study landscape preferences, such as social surveys, produce high-quality data but are costly in time and effort and are poorly suited to capturing dynamic landscape-scale changes across large geographic areas. With the rapid rise of social media, a huge amount of user-generated data is now available for researchers to study emotions or sentiments (i.e., preferences) towards particular topics of interest. This dissertation investigates how social media data can be used to indirectly measure (Zanten et al., 2016) and learn features relevant to landscape preferences, focusing primarily on a specific landscape called green infrastructure (GI). The first phase of the work introduces a first-ever benchmark GI location dataset within the US (GReen Infrastructure Dataset, or GRID) and develops a computer vision algorithm for identifying GI from aerial images using the Google/Bing Map API. The data collected with this object detection method are then used to retrain a previously developed human preference model (Rai, 2013), which significantly improved its prediction accuracy. I found that the framework introduced here can collect landscape data comparable in quality to that of current methods with much less effort. The second phase uses GI images and textual comments from Flickr, Instagram, and Twitter to train a lexicon-based sentiment model for predicting people's sentiments toward GI.
Since almost 70 percent of US adults use some social media platform to connect with friends and family or to follow recent news and topics of interest (Pew Research, 2015), it is imperative to understand whether people share, post, or comment about the landscape settings they live in or prefer. The results show that social media information can be useful in predicting people’s sentiments about the landscapes they live in or visit. The third phase builds on the second phase to identify specific features that are correlated with higher and lower preferences. The findings demonstrate that we can learn features that affect people’s preferences about a landscape. These features are descriptive enough for a layperson to understand and can also help designers, storm-water engineers, and city planners create landscape designs that improve human health and well-being. Finally, I conclude and describe follow-up research that I think has potential for understanding landscapes: speeding up the object detection algorithms using more advanced computer vision methods and the power of GPUs, and extending the findings to other types of GI and landscape designs.
Issue Date: 2018-04-17
Rights Information: Copyright 2018 Ankit Rai
Date Available in IDEALS: 2018-09-04
Date Deposited: 2018-05
