Files in this item



Taxonomic Name Recognition in Biodiversity Heritage Library (Microsoft PowerPoint, 1MB)
Other Available Formats


BHL-Qin.ppt.pdf (PDF, 895kB), automatically converted using OpenOffice.org


Title: Taxonomic Name Recognition in Biodiversity Heritage Library
Author(s): Wei, Qin; Freeland, Chris; Heidorn, P. Bryan
Subject(s): Taxonomic Name Recognition; Biodiversity Data
Abstract: Taxonomic Name Recognition (TNR), the task of deciding whether a text string is a taxonomic name and locating the boundaries of that name, is central to the BHL digitization project because it determines whether users and researchers can find the materials they want efficiently. The BHL has incorporated TaxonFinder, an externally provided taxonomic name finding algorithm and service, into its portal for the identification and verification of taxonomic name strings found within the digitized BHL corpus. An eight-week evaluation was performed to determine the factors affecting the accuracy of the results returned. Our findings are valuable not only for BHL but also for other digital projects that want to do text mining on their collections. In this evaluation project, we explored and analyzed three factors influencing performance: 1) Optical Character Recognition (OCR) for transforming images into text, 2) TNR matching algorithms for identifying taxonomic names in texts, and 3) the completeness of NameBank, which is used as an authority file for name verification.
Issue Date: 2008-10-17
Genre: Presentation / Lecture / Speech
Publication Status: published or submitted for publication
Peer Reviewed: is peer reviewed
Date Available in IDEALS: 2008-10-26
