Files in this item



audio/mpeg efron.mp3 (11MB)
Audio file, mp3 audio


application/vnd.openxmlformats-officedocument.presentationml.presentation efron.pptx (555kB)
PowerPoint Presentation/Slides, Microsoft PowerPoint 2007


Title: Hitting a Moving Target: Historical language change and information retrieval
Author(s): Efron, Miles
Subject(s): information retrieval
Abstract: Massive book digitization projects such as Google Books offer readers new ways to interact with textual information. The full-text indexing that accompanies digitization lets us search for information, quickly delivering putatively relevant book passages for our attention. However, keyword-based information retrieval fails to meet the challenge posed by repositories such as Google Books, due in large part to language change. In this talk, I will describe ongoing research aimed at improving our ability to find information across bodies of temporally diverse text. A person researching the origins and history of the proverb "all that glitters is not gold" would like to see passages containing the saying itself. But he or she is also likely to be interested in Chaucer's verse, "alle is not golde that glareth" or Lydgate's "alle is not golde that shewyth goldishe hewe." Since the 15th Century (and indeed before that), English has changed, frustrating the methods of information retrieval that serve us well in contemporary text. The subject of this talk is a novel statistical framework that allows searches posed in, say, 21st-Century English to retrieve texts written in Middle English and Early Modern English, as well as in familiar modern idioms.
Issue Date: 2011
Publisher: Graduate School of Library and Information Science. University of Illinois at Urbana-Champaign.
Genre: Presentation / Lecture / Speech
Date Available in IDEALS: 2012-03-12
