Files in this item

Files Description Format
quiwei_iconf2010_final.pdf (105KB) (no description provided) PDF

Description

Title: Name Matters: Taxonomic Name Recognition (TNR) in Biodiversity Heritage Library (BHL)
Author(s): Wei, Qin; Heidorn, P. Bryan; Freeland, Chris
Subject(s): Taxonomic Name Recognition
TNR
biodiversity informatics
Machine Learning
Digital Libraries
Information Retrieval
Abstract: Taxonomic Name Recognition (TNR) is a prerequisite for more advanced processing and mining of full-text taxonomic literature. This paper investigates three aspects of current TNR tools in detail: (1) the difficulties of TNR and the methods used to address them; (2) the performance of Optical Character Recognition (OCR) and TNR tools on samples from the Biodiversity Heritage Library (BHL); and (3) methods for potential improvement. We found that the performance of current TNR techniques needs to be improved. A detailed error analysis reveals that sublanguage characteristics account for much of the error. A preliminary experiment using Naive Bayes (NB) models shows the potential of machine learning (ML) for TNR.
Issue Date: 2010-02-03
Genre: Conference Paper / Presentation
Type: Text
URI: http://hdl.handle.net/2142/14919
Date Available in IDEALS: 2010-02-20


Item Statistics

  • Total Downloads: 388
  • Downloads this Month: 2
  • Downloads Today: 0