Files in this item



LibraryTrends573_hswe.pdf (application/pdf, 981kB)


Title: The Web Archives Workbench (WAW) Tool Suite: Taking an Archival Approach to the Preservation of Web Content
Author(s): Hswe, Patricia; Kaczmarek, Joanne S.; Houser, Leah; Eke, Janet
Subject(s): Digital preservation; National Digital Information Infrastructure Preservation Program (NDIIPP)
Abstract: The ECHO DEPository (also known as ECHO DEP, an abbreviation for Exploring Collaborations to Harvest Objects in a Digital Environment for Preservation) is an NDIIPP-partner project led by the University of Illinois at Urbana-Champaign in collaboration with OCLC and a consortium of partners, including five state libraries and archives. A core deliverable of the project’s first phase was OCLC’s development of the Web Archives Workbench (WAW), an open-source suite of Web archiving tools for identifying, describing, and harvesting Web-based content for ingestion into an external digital repository. Released in October 2007, the suite is designed to bridge the gap between manual selection and automated capture based on the “Arizona Model,” which applies a traditional aggregate-based archival approach to Web archiving. Aggregate-based archiving refers to archiving items by group or in series, rather than individually. Core functionality of the suite includes the ability to identify Web content of potential interest through crawls of “seed” URLs and the domains they link to; tools for creating and managing metadata for association with harvested objects; website structural analysis and visualization to aid human content selection decisions; and packaging using a PREMIS-based METS profile developed by the ECHO DEPository to support easier ingestion into multiple repositories. This article provides background on the Arizona Model; an overview of how the tools work and their technical implementation; and a brief summary of user feedback from testing and implementing the tools.
Issue Date: 2009
Publisher: Johns Hopkins University Press and the Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
Citation Info: In Library Trends 57 (3) Winter 2009: 442-460.
Publication Status: published or submitted for publication
Rights Information: Copyright 2009 Board of Trustees of the University of Illinois.
Date Available in IDEALS: 2011-03-15
