Files in this item



text/csv incopyrightfiction.csv (36MB)
The dataset itself. CSV file


application/ (5kB)
(no description provided) Unknown


Title: A List of English-Language Fiction after 1922 in HathiTrust
Author(s): Underwood, Ted
Subject(s): fiction, HathiTrust, distant reading, 20th century
Abstract: Metadata for English-language fiction in HathiTrust Digital Library, after 1922. These volumes were identified as fiction algorithmically, using a predictive model trained on text, supplemented by metadata. Algorithmic prediction is imperfect, and this dataset contains errors that the author has not yet had time to fully measure and document. (Measuring recall, for instance, is not trivial.) The data is offered by the author without any promise or warranty. Use it if you find that it is, in practice, better than other alternatives; stop using it as soon as a better alternative becomes available.
Issue Date: 2017-09-13
Citation Info: Ted Underwood, "A List of English-Language Fiction after 1922 in HathiTrust," IDEALS, 2017, URL.
Type: Dataset / Spreadsheet
Date Available in IDEALS: 2017-09-13
