Files in this item

Files                                 Description                 Format
incopyrightfiction.csv (36MB)         The dataset itself.         CSV file (text/csv)
incopyrightdatadictionary.md (5kB)    (no description provided)   Unknown (application/octet-stream)

Description

Title: A List of English-Language Fiction after 1922 in HathiTrust
Author(s): Underwood, Ted
Subject(s): fiction, HathiTrust, distant reading, 20th century
Abstract: Metadata for English-language fiction in HathiTrust Digital Library, after 1922. These volumes were identified as fiction algorithmically, using a predictive model trained on text, supplemented by metadata. Algorithmic prediction is imperfect, and this dataset contains errors that the author has not yet had time to fully measure and document. (Measuring recall, for instance, is not trivial.) The data is offered by the author without any promise or warranty. Use it if you find that it is, in practice, better than other alternatives; stop using it as soon as a better alternative becomes available.
Issue Date: 2017-09-13
Citation Info: Ted Underwood, "A List of English-Language Fiction after 1922 in HathiTrust," IDEALS, 2017, URL.
Genre: Data
Type: Dataset / Spreadsheet
URI: http://hdl.handle.net/2142/97948
Date Available in IDEALS: 2017-09-13

