Files in this item



ECE499-Sp2018-qian-Yanli.pdf (application/pdf, 1MB), restricted to U of Illinois


Title: Entity semantic based system for question and answer search
Author(s): Qian, Yanli
Contributor(s): Chang, Kevin C.C.
Subject(s): entity semantic search
pattern search
natural language processing
Abstract: Entity semantic search allows users to search for entity patterns inside documents, where entities are semantic data objects. The system is built on the open-source search server Elasticsearch and Apache Lucene. Named Entity Recognition is performed on a plain Q&A corpus to extract entities and annotate each with its category. Each entity is indexed as a JSON document following the inverted-index principle, and the resulting documents are imported into Elasticsearch for querying. A plugin clusters and ranks the query results. It contains a RESTful handler with a customized request handler and response handler. The plugin is loaded before the Elasticsearch server starts and extends the Elasticsearch runtime by adding a RESTful endpoint. It groups the Elasticsearch results by entity content, so that results sharing the same entity content are placed in the same cluster. The system is visualized via a web interface. The thesis elaborates on the novelty of searching entity patterns inside documents and on the methods and models used to build the entire search engine.
Issue Date: 2018-05
Sponsor: Huawei Limited Technologies Co., Ltd.
Date Available in IDEALS: 2018-05-24
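
The clustering step described in the abstract (grouping results that share the same entity content) can be illustrated with a minimal Python sketch. This is not the thesis's plugin, which runs inside Elasticsearch as a RESTful endpoint. The hit fields (entity, category, doc_id, score), the function name cluster_hits_by_entity, and the ranking rule (larger clusters first, ties broken by best score) are all assumptions made for illustration.

```python
from collections import defaultdict


def cluster_hits_by_entity(hits):
    """Group search hits by entity content and rank the resulting clusters.

    Assumed ranking rule (hypothetical, not taken from the thesis):
    clusters with more hits come first; ties are broken by the best score.
    """
    clusters = defaultdict(list)
    for hit in hits:
        # Hits that share the same entity content land in the same cluster.
        clusters[hit["entity"]].append(hit)

    ranked = sorted(
        clusters.items(),
        key=lambda item: (-len(item[1]), -max(h["score"] for h in item[1])),
    )
    return [
        {"entity": entity, "count": len(group), "hits": group}
        for entity, group in ranked
    ]


# Hypothetical hits, shaped loosely like entity documents returned by a query.
sample_hits = [
    {"entity": "Chicago", "category": "LOCATION", "doc_id": 1, "score": 1.2},
    {"entity": "Urbana", "category": "LOCATION", "doc_id": 2, "score": 2.0},
    {"entity": "Chicago", "category": "LOCATION", "doc_id": 3, "score": 0.7},
]

clusters = cluster_hits_by_entity(sample_hits)
for cluster in clusters:
    print(cluster["entity"], cluster["count"])
```

With the sample data, the two "Chicago" hits form one cluster and the single "Urbana" hit forms another. A real deployment would apply whatever ranking the plugin actually defines.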
