Files in this item



application/pdf: 3070037.pdf (10MB). Restricted to U of Illinois.


Title: Integrating Similarity Based Retrieval and Query Refinement in Databases
Author(s): Ortega-Binderberger, Michael
Doctoral Committee Chair(s): Sharad Mehrotra
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Library Science
Abstract: With the emergence of many application domains that require imprecise, similarity-based access to information, techniques to support such a retrieval paradigm over database systems have emerged as a critical area of research. This search paradigm has two major research areas. The first is how to interpret similarity search queries in a relational database context. We address this problem by extending relational operators to natively understand similarity-based retrieval and by providing similarity operators that act on user-defined data types. The second is how to interactively improve the query through user interaction (query refinement). We address this problem by extending some information retrieval and machine learning techniques to affect the interpretation of similarity predicates and the query search condition itself. The result is a new query that better satisfies the user's information need. The semantics of this domain favor a "top-k" retrieval approach, where we seek only the best-matching results for a query. We therefore developed several techniques to process queries efficiently in each of the two areas (similarity search and query refinement). For similarity queries, our query processing algorithms naturally support the top-k retrieval paradigm by returning the answers in order of their similarity to the query. For query refinement, our query processing algorithms reuse the results from previous queries to quickly answer the new refined queries. The algorithms avoid duplicating work done before and perform only the minimum work needed to return the next answer, resulting in up to an order of magnitude performance improvement over a naive re-execution of a refined query.
Issue Date: 2002
Description: 169 p.
Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2002.
Other Identifier(s): (MiAaPQ)AAI3070037
Date Available in IDEALS: 2015-09-25
Date Deposited: 2002
