Files in this item

Files: ZHANG-DISSERTATION-2017.pdf (4MB), restricted to U of Illinois
Description: (no description provided)
Format: PDF (application/pdf)

Description

Title: Analyzing intentions from big data traces of human activities
Author(s): Zhang, Aston
Director of Research: Gunter, Carl A; Han, Jiawei
Doctoral Committee Chair(s): Gunter, Carl A; Han, Jiawei
Doctoral Committee Member(s): Zhai, ChengXiang; Baeza-Yates, Ricardo
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: Ph.D.
Genre: Dissertation
Subject(s): Data mining, machine learning, optimization, privacy
Abstract: The rapid growth of big data formed by human activities makes research on intention analysis both challenging and rewarding. We study multifaceted problems in analyzing intentions from big data traces of human activities; these problems span machine learning, optimization, and security and privacy. We show that analyzing intentions from industry-scale human activity big data can effectively improve the accuracy of computational models. Specifically, we take query auto-completion as a case study. We identify two hitherto-undiscovered problems: adaptive query auto-completion and mobile query auto-completion. We develop two computational models by analyzing intentions from big data traces of human activities on search interface interactions and on mobile application usage, respectively. Solving the large-scale optimization problems in the proposed query auto-completion models drives deeper studies of the solvers. Hence, we consider generalized machine learning problem settings and focus on developing lightweight stochastic algorithms as solvers for large-scale convex optimization problems with theoretical guarantees. For optimizing strongly convex objectives, we design an accelerated stochastic block coordinate descent method with optimal sampling; for optimizing non-strongly convex objectives, we design a stochastic variance reduced alternating direction method of multipliers with the doubling trick. Inevitably, human activities are human-centric, so research on them can inform security and privacy. On the one hand, intention analysis research on human activities can be motivated from the security perspective. For instance, to reduce false alarms about medical service providers' suspicious accesses to electronic health records, we discover potential de facto diagnosis specialties that reflect such providers' genuine and permissible intentions of accessing records with certain diagnoses.
On the other hand, we examine the privacy risk in anonymized heterogeneous information networks representing large-scale human activities, such as those in social networking. Such data are released for external researchers to improve the prediction accuracy for users' online social networking intentions on the publishers' microblogging site. We show a negative result that makes a compelling argument: privacy must be a central goal for sensitive human activity data publishers.
Issue Date: 2017-04-20
Type: Thesis
URI: http://hdl.handle.net/2142/97738
Rights Information: Copyright 2017 Aston Zhang
Date Available in IDEALS: 2017-08-10
Date Deposited: 2017-05

