Files in this item



3392471.pdf (application/pdf, 2MB), restricted to U of Illinois


Title: Challenges in Managing Information Extraction
Author(s): Shen, Warren H.
Doctoral Committee Chair(s): Doan, AnHai
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Computer Science
Abstract: In this dissertation, we develop solutions to the key challenges mentioned above. First, we develop a declarative framework that can help make it easier for developers to write and understand IE programs, and show how to automatically optimize IE programs written in this framework to reduce runtime. Next, given that relational database systems (RDBMSs) were designed to store and process large data sets, we study the benefits and limitations of employing RDBMSs for storing and processing data in IE applications. Finally, we extend our declarative framework to enable best-effort IE, allowing developers to more easily write and refine approximate IE programs. A key idea underlying these solutions is that many of the principles behind RDBMSs for managing structured data can be extended to IE for managing unstructured data.
Issue Date: 2009
Description: 120 p.
Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2009.
Other Identifier(s): (MiAaPQ)AAI3392471
Date Available in IDEALS: 2015-09-25
Date Deposited: 2009
