Files in this item



Tao_Cheng.pdf (application/pdf, 1MB)


Title: Toward Entity-Aware Search
Author(s): Cheng, Tao
Director of Research: Chang, Kevin C-C.
Doctoral Committee Chair(s): Han, Jiawei
Doctoral Committee Member(s): Chang, Kevin C-C.; Zhai, ChengXiang; Weikum, Gerhard
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Entity Search; Entity-aware Search; Entity Indexing; Entity Synonym; Content Query Language
Abstract: As the Web has evolved into a data-rich repository, current search engines, with their standard "page view," are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., phone number, paper PDF, date), today's engines only take us indirectly to pages. In my Ph.D. study, we focus on a novel type of Web search that is aware of data entities inside pages, a significant departure from traditional document retrieval. We study the essential aspects of supporting entity-aware Web search. To begin with, we tackle the core challenge of ranking entities by distilling its underlying conceptual model, the Impression Model, and developing a probabilistic ranking framework, EntityRank, that seamlessly integrates both local and global information in ranking. We also report a prototype system built to show the initial promise of the proposal. Then, we aim at distilling and abstracting the essential computation requirements of entity search. From the dual views of reasoning (entity as input and entity as output), we propose a dual-inversion framework, with two indexing and partition schemes, towards efficient and scalable query processing. Further, to recognize more entity instances, we study the problem of entity synonym discovery through mining query log data. The results obtained so far show clear promise for entity-aware search in its usefulness, effectiveness, efficiency, and scalability.
Issue Date: 2011-01-14
Rights Information: Copyright 2010 Tao Cheng
Date Available in IDEALS: 2011-01-14
Date Deposited: 2010-12
