Files in this item: Gu_Randy.pdf (application/pdf, 1MB)

Title: Data Cleaning Framework: An Extensible Approach to Data Cleaning
Author(s): Gu, Randy S.
Advisor(s): Chang, Kevin C-C.
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Data Cleaning
Abstract: The growing dependence of society on enormous quantities of information stored electronically has led to a corresponding rise in errors in this information. The stored data can be critically important, necessitating new ways of correcting anomalous records. Current cleaning techniques are very domain-specific and hard to extend, hindering their use in some areas. This work proposes an extensible framework for data cleaning, allowing users to customize the cleaning to their specific requirements. It defines categories of common cleaning operations, allowing more robust support for user-implemented cleaning functions in these categories. The experimental results show that the proposed data cleaning framework is an effective approach to cleaning data for arbitrary domains.
Issue Date: 2011-01-14
Rights Information: Copyright 2010 Randy Siran Gu
Date Available in IDEALS: 2011-01-14
Date Deposited: 2010-12
