Files in this item



Jackson_Larry.pdf (application/pdf, 2MB)


Title: Website Structure
Author(s): Jackson, Larry S.
Director of Research: Dubin, David
Doctoral Committee Chair(s): Renear, Allen H.
Doctoral Committee Member(s): Dubin, David; Haythornthwaite, Caroline A.; Moen, William E.; La Barre, Kathryn A.
Department / Program: Graduate School of Library and Information Science
Discipline: Library and Information Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): website structure; hypertext graph; linear discriminant analysis; linear classifier; website archiving; state government websites; Illinois State Library
Abstract: This dissertation reports the results of an exploratory data analysis investigation of the relationship between the structures used for information organization and access and the associated storage structures within state government websites. Extending an earlier claim that hierarchical directory structures are both the preeminent information organization and file storage mechanism, three different classes of overall website structure were found to be identifiable by linear classifiers, when trained on features of the website hypertext graphs. Two more structural types, not analyzed with the classifiers, were suggested through an examination of misclassified websites. Further, the notion of website structure was found to be best modeled recursively, allowing variation on a sub-graph level, instead of deeming a structural class to apply to the entirety of a website. Linear discriminant analysis was used to construct a series of experimental classifiers, using subsets of ten features identified by either earlier classifiers or principal components analysis. Two groups of features, seemingly reflecting website size and graph density, were found to convey somewhat redundant information to the classifiers, in this application. A number of other practices in website implementation were uncovered that engender classifier errors, arguing for either the deliberate inclusion of websites having these properties in the training dataset, or the expansion of the feature set. Hierarchical cluster analysis and blockmodeling of whole-website graphs were also briefly investigated, and found to occasionally contribute file relatedness information of fundamentally distinct types, and information sometimes at variance with directory structure usage for file storage.
Multiple literatures suggest a number of social factors that may influence the way websites and webpages are constructed within an organization, particularly the differing types of administrative control in bureaucracies, and the nature of help-seeking in technology work. While traces reminiscent of these suggestions were encountered, investigation of social causal factors behind website structural choices in the organizational types and workplace styles of the sponsoring agency remains a task for other researchers.
Issue Date: 2009-06-01
Rights Information: Copyright 2009 Larry S. Jackson
Date Available in IDEALS: 2009-06-01
Date Deposited: May 2009
