Files in this item



application/pdf quiwei_iconf2010_final.pdf (105kB)
(no description provided) PDF


Title: Name Matters: Taxonomic Name Recognition (TNR) in Biodiversity Heritage Library (BHL)
Author(s): Wei, Qin; Heidorn, P. Bryan; Freeland, Chris
Subject(s): Taxonomic Name Recognition
biodiversity informatics
Machine Learning
Digital Libraries
Information Retrieval
Abstract: Taxonomic Name Recognition (TNR) is a prerequisite for more advanced processing and mining of full-text taxonomic literature. This paper investigates three issues with current TNR tools in detail: (1) the difficulties and methods involved in TNR; (2) the performance of Optical Character Recognition (OCR) and TNR tools on samples from the Biodiversity Heritage Library (BHL); and (3) methods for potential improvement. We found that the performance of current TNR techniques needs to be improved. A detailed error analysis reveals that sublanguage characteristics account for much of the error. A preliminary experiment using Naive Bayes (NB) models shows the potential of using machine learning (ML) in TNR.
Issue Date: 2010-02-03
Genre: Conference Paper / Presentation
Date Available in IDEALS: 2010-02-20
