Files in this item

File:SP20-ECE499-Thesis-Zhang, Tailin.pdf (635kB), Restricted to U of Illinois
Description:(no description provided)
Format:PDF (application/pdf)

Description

Title:The discovery of relationships among scientific datasets
Author(s):Tailin, Zhang
Contributor(s):Alawini, Abdussalam; Shomorony, Ilan
Subject(s):Spreadsheet Data Management
Clustering Optimization
Relationship Prediction
Abstract:When recording experimental data, scientists often accumulate thousands of datasets that differ in data type, format, and style. This makes it difficult to select the right subsets for analysis, sharing, or storage in structured databases. Determining relationships among large datasets can be very helpful for scientists who store their data in file-based formats such as spreadsheets or CSV files. In this project, we are creating a system for predicting relationships between datasets stored in spreadsheets. With the predicted relationships, we can identify the most complete version of a dataset, link related data elements together, and discard redundant or unrelated datasets.
Issue Date:2020-05
Genre:Other
Type:Text
Language:English
URI:http://hdl.handle.net/2142/107760
Date Available in IDEALS:2020-07-10

