Files in this item



SP20-ECE499-Thesis-Zhang, Tailin.pdf (application/pdf, 635kB). Restricted to U of Illinois.


Title: The discovery of relationships among scientific datasets
Author(s): Tailin, Zhang
Contributor(s): Alawini, Abdussalam; Shomorony, Ilan
Subject(s): Spreadsheet Data Management; Clustering Optimization; Relationship Prediction
Abstract: When recording experimental data, scientists often accumulate thousands of datasets that differ in data types, formats, and styles. This makes it difficult to select the right subsets for analysis, sharing, or storage in structured databases. Determining the relationships between large datasets can be very helpful for scientists who store their data in file-based formats, such as spreadsheets or CSV files. In this project, we are creating a system for predicting relationships between datasets stored in spreadsheets. With the predicted relationships, we can further identify the most complete version of a dataset, link related data elements together, and discard redundant or unrelated datasets.
Issue Date: 2020-05
Date Available in IDEALS: 2020-07-10
