Files in this item



CHANG-DISSERTATION-2016.pdf (application/pdf, 4MB)


Title: Similarity learning in the era of big data
Author(s): Chang, Shiyu
Director of Research: Huang, Thomas S.
Doctoral Committee Chair(s): Huang, Thomas S.
Doctoral Committee Member(s): Hasegawa-Johnson, Mark A.; Liang, Zhi-Pei; Aggarwal, Charu C.
Department / Program: Electrical & Computer Eng
Discipline: Electrical & Computer Engr
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Similarity Learning; Big Data; Large Volume Data; Multimodality Data; High-velocity Data; Supervised Similarity Learning; Network Embedding; Deep Embedding; Heterogeneous Network; Streaming Network; Positive-Unlabeled Learning; Link Prediction; Social Media; Search and Retrieval
Abstract: This dissertation studies the problem of similarity learning in the era of big data, with heavy emphasis on real-world applications in social media. As in the saying “birds of a feather flock together,” similarity learning aims to identify the notion of being similar in a data-driven and task-specific way, which is a central problem for maximizing the value of big data. Despite the many successes of similarity learning over the past decades, social media networks, as one of the most typical big data media, contain large-volume, varied, and high-velocity data, which makes conventional learning paradigms and off-the-shelf algorithms insufficient. Thus, we focus on addressing the emerging challenges brought by the inherent “three-Vs” characteristics of big data by answering the following questions: 1) Similarity is characterized by both links and node contents in networks; how can we identify the contribution of each network component to seamlessly construct an application-oriented similarity function? 2) Social media data are massive and contain much noise; how can we efficiently learn the similarity between node pairs in large and noisy environments? 3) Node contents in social media networks are multi-modal; how can we effectively measure cross-modal similarity by bridging the so-called “semantic gap”? 4) User wants and needs, and item characteristics, are continuously evolving, which generates data at an unprecedented rate; how can we model temporal dynamics in a principled way and support timely decision making? The goal of this dissertation is to provide solutions to these questions via innovative research and novel methods. We hope this dissertation sheds more light on similarity learning in the big data era and broadens its applications in social media.
Issue Date: 2016-11-04
Rights Information: Copyright 2016 Shiyu Chang
Date Available in IDEALS: 2017-03-01
Date Deposited: 2016-12
