Files in this item



BENDRE-DISSERTATION-2018.pdf (application/pdf, 4MB)


Title: Towards unifying spreadsheets with databases for ad-hoc interactive data management at scale
Author(s): Bendre, Mangesh
Director of Research: Parameswaran, Aditya
Doctoral Committee Chair(s): Parameswaran, Aditya
Doctoral Committee Member(s): Chang, Kevin; Zhai, ChengXiang; Nandi, Arnab
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
big-data management
Abstract: We are witnessing the increasing availability of data across a spectrum of domains, and putting this data to use requires interactive, ad-hoc management and analysis. Unfortunately, interactive ad-hoc management of very large datasets presents a host of challenges, ranging from performance to interface usability. This thesis introduces a new research direction, the manipulation of large datasets through an interactive interface, and takes several steps in this direction. In particular, we develop DataSpread, a tool that enables users to work with arbitrarily large datasets via a direct-manipulation interface. DataSpread holistically unifies spreadsheets and relational databases to leverage the benefits of both. However, this integration is not trivial because the two paradigms, spreadsheets and databases, differ in architecture and ideology. We have built a prototype of DataSpread, which, in addition to motivating the underlying challenges, demonstrates the feasibility and usefulness of this integration. We focus on the following challenges encountered while developing DataSpread: (i) Representation, where we address the challenges of flexibly representing ad-hoc spreadsheet data within a relational database; (ii) Indexing, where we develop indexing data structures for supporting and maintaining access by position; (iii) Formula Computation, where we introduce an asynchronous formula computation framework that addresses the challenge of ensuring consistency and interactivity at the same time; and (iv) Organization, where we develop a framework to best organize data based on a workload, e.g., queries specified on the spreadsheet interface.
Issue Date: 2018-12-05
Rights Information: Copyright 2018 Mangesh Bendre
Date Available in IDEALS: 2019-02-06
Date Deposited: 2018-12
