Enabling Data Retrieval: By Ranking and Beyond
- Enabling Data Retrieval: By Ranking and Beyond
- Li, Chengkai
- Issue Date
- data retrieval
- information retrieval
- The ubiquitous usage of databases for managing structured data, compounded with the expanded reach of the Internet to end users, has brought forward new scenarios of data retrieval. Users often want to express non-traditional fuzzy queries with soft criteria, in contrast to Boolean queries, and to explore what choices are available in databases and how they match the query criteria. Conventional database management systems (DBMSs) have become increasingly inadequate for such new scenarios. Towards enabling data retrieval, this thesis first studies how to fundamentally integrate ranking into databases. We built RankSQL, a DBMS that provides systematic and principled support of ranking queries. With a new ranking algebra and an extended query optimizer for the algebra, RankSQL captures ranking as a first-class construct in databases, together with traditional Boolean constructs. We invented efficient techniques for answering ad-hoc ranking aggregate queries. RankSQL provides significant performance improvement over current DBMSs in processing ranking queries and ranking aggregate queries. This thesis further studies how to enable retrieval mechanisms beyond just ranking. Our explorative study in this direction is exemplified by two novel proposals-- One is to integrate clustering and ranking of database query results; the other is to support inverse ranking queries that provide ranks of objects in query context. Injecting such non-traditional facilities into databases presents non-trivial challenges in both defining query semantics and designing query processing methods. We extended SQL language to express such queries and invented partition- and summary-driven approaches to process them.
- Type of Resource
- Copyright and License Information
- You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format, BUT this permission is only for a period of 45 (forty-five) days from the most recent time that you verified that this technical report is still available from the University of Illinois at Urbana-Champaign Computer Science Department under terms that include this permission. All other rights are reserved by the author(s).
Edit Collection Membership