Files in this item

Files | Description | Format
Gu_Randy.pdf (1MB) | (no description provided) | PDF

Description

Title: Data Cleaning Framework: An Extensible Approach to Data Cleaning
Author(s): Gu, Randy S.
Advisor(s): Chang, Kevin C-C.
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: M.S.
Genre: Thesis
Subject(s): Data Cleaning
Abstract: The growing dependence of society on enormous quantities of information stored electronically has led to a corresponding rise in errors in this information. The stored data can be critically important, necessitating new ways of correcting anomalous records. Current cleaning techniques are very domain-specific and hard to extend, hindering their use in some areas. This work proposes an extensible framework for data cleaning, allowing users to customize the cleaning to their specific requirements. It defines categories of common cleaning operations, allowing more robust support for user-implemented cleaning functions in these categories. The experimental results show that the proposed data cleaning framework is an effective approach to cleaning data for arbitrary domains.
Issue Date: 2011-01-14
URI: http://hdl.handle.net/2142/18304
Rights Information: Copyright 2010 Randy Siran Gu
Date Available in IDEALS: 2011-01-14
Date Deposited: 2010-12

