Title: Proposal of Document Classification with Word Sense Disambiguation
Author(s): Liu, Xiaozhong
Subject(s): Document classification
Abstract: An important area of NLP is the study of Word Sense Disambiguation (WSD), which assigns a single sense to each occurrence of an ambiguous word. WSD can be approached in different ways. One determines the sense from the collocation of other words (Yarowsky, 1993): nearby words provide strong, consistent clues to the sense of a target word, conditional on relative distance, order and syntactic relationship. Another determines the sense from discourse (Gale et al., 1992), on the observation that the sense of a word is consistent within any given document. Experiments in recent years with both supervised (Leacock, 1993) and unsupervised (Yarowsky, 1993) WSD algorithms have achieved promising performance with a high precision rate. Another important area, in the field of text mining (Lewis & Sparck Jones papers), is document classification, which assigns one or more of several topic labels to a text document. A significant body of research has improved the results of document classification, both by identifying better document features and by improving algorithms. In this study, we will use WSD as part of a method to create innovative features that represent documents for the classification task. With the help of WSD, a set of specially selected ambiguous words can be further distinguished by word sense clustering, in order to achieve better document classification.
Issue Date: 2008-02-28
Genre: Conference Poster
Date Available in IDEALS: 2010-03-10
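
The idea in the abstract, splitting selected ambiguous words into sense clusters and using the resulting sense-tagged words as classification features, can be illustrated with a minimal sketch. This is not the author's method: the abstract does not specify the clustering algorithm, the context window, or the feature format. The sketch assumes a simple k-means-style clustering of collocation contexts with cosine similarity, a hypothetical stopword list, and hypothetical names (`context_vector`, `cluster_senses`, `sense_features`, and the `word#k` feature format).

```python
from collections import Counter
import math

# Hypothetical stopword list, used only for this illustration.
STOPWORDS = {"the", "a", "on", "at", "for", "and", "was", "she", "he"}


def context_vector(tokens, index, window=3):
    """Bag of non-stopword neighbours within `window` tokens of position `index`."""
    lo = max(0, index - window)
    neighbours = tokens[lo:index] + tokens[index + 1:index + 1 + window]
    return Counter(t for t in neighbours if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def cluster_senses(contexts, k=2, iterations=10):
    """Assign each context to one of k clusters (assumed stand-in for WSD)."""
    if not contexts:
        return []
    k = min(k, len(contexts))
    # Deterministic seeding: the first context, then whichever remaining
    # context is least similar to the seeds chosen so far.
    seeds = [0]
    while len(seeds) < k:
        rest = [i for i in range(len(contexts)) if i not in seeds]
        seeds.append(min(rest, key=lambda i: max(
            cosine(contexts[i], contexts[s]) for s in seeds)))
    centroids = [Counter(contexts[s]) for s in seeds]
    labels = None
    for _ in range(iterations):
        new_labels = [max(range(k), key=lambda c: cosine(ctx, centroids[c]))
                      for ctx in contexts]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Recompute each centroid as the summed counts of its members.
        centroids = [Counter() for _ in range(k)]
        for ctx, lab in zip(contexts, labels):
            centroids[lab].update(ctx)
    return labels


def sense_features(documents, target, k=2, window=3):
    """Return one Counter per document, with `target` replaced by `target#k`."""
    tokenised = [doc.lower().split() for doc in documents]
    positions = [(d, i) for d, toks in enumerate(tokenised)
                 for i, tok in enumerate(toks) if tok == target]
    contexts = [context_vector(tokenised[d], i, window) for d, i in positions]
    labels = cluster_senses(contexts, k)
    for (d, i), lab in zip(positions, labels):
        tokenised[d][i] = f"{target}#{lab}"
    return [Counter(toks) for toks in tokenised]


if __name__ == "__main__":
    docs = [
        "the river bank was muddy after the river flooded",
        "she sat on the river bank near the water",
        "the bank approved the loan and the money arrived",
        "he deposited money at the bank for a loan",
    ]
    for features in sense_features(docs, "bank"):
        print(sorted(t for t in features if t.startswith("bank")))
```

The per-document counters could then be passed to any standard text classifier in place of plain bag-of-words counts. Whether sense-tagged features actually improve classification is the open question the proposal sets out to study.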
