Using Speech Input for Image Interpretation, Annotation, and Retrieval

Bookmark or cite this item: http://hdl.handle.net/2142/25949

Files in this item

File: Srihari_Using.pdf (9MB)
Description: (no description provided)
Format: PDF
Title: Using Speech Input for Image Interpretation, Annotation, and Retrieval
Author(s): Srihari, Rohini K.
Subject(s): Digital Libraries; electronic information resources; digital images; image access; image retrieval; linguistic context; caption based; audio annotation
Abstract: "This research explores the interaction of textual and photographic information in an integrated text/image database environment. Specifically, three different applications involving the exploitation of linguistic context in vision are presented. Linguistic context is qualitative in nature and is obtained dynamically. By understanding text accompanying images or video, we are able to extract information useful in retrieving the picture and directing an image interpretation system to identify relevant objects (e.g., faces) in the picture. The latter constitutes a powerful technique for automatically indexing images. A multistage system, PICTION, which uses captions to identify human faces in an accompanying photograph, has been developed. We discuss the use of PICTION's output in content-based retrieval of images to satisfy focus of attention in queries. The design and implementation of a system called Show&Tell, a multimedia system for semi-automated image annotation, is discussed. This system, which combines advances in speech recognition, natural language processing (NLP), and image understanding (IU), is designed to assist in image annotation and to enhance image retrieval capabilities. An extension of this work to video annotation and retrieval is also presented."
Issue Date: 1997
Publisher: Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
Citation Info: Srihari, Rohini K. (1997) Using Speech Input for Image Interpretation, Annotation, and Retrieval. In P. Bryan Heidorn; Beth Sandore (Eds.) (1997) Digital Image Access & Retrieval [papers presented at the 1996 Clinic on Library Applications of Data Processing, March 24-26, 1996 Urbana-Champaign] : Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign: 140-156.
Series/Report: Digital Image Access & Retrieval [papers presented at the 1996 Clinic on Library Applications of Data Processing, March 24-26, 1996 Urbana-Champaign]
Genre: Conference Paper / Presentation
Type: Text
Language: English
URI: http://hdl.handle.net/2142/25949
ISBN: 0-87845-100-5
ISSN: 0069-4789
Publication Status: published or submitted for publication
Date Available in IDEALS: 2011-08-17