Data Quality Assessment of Retracted Papers: Patterns and Retraction Status Shifts in Titles
Si, Luyang; Salami, Malik Oyewale; Schneider, Jodi
Permalink
https://hdl.handle.net/2142/129059
Description
- Title
- Data Quality Assessment of Retracted Papers: Patterns and Retraction Status Shifts in Titles
- Author(s)
- Si, Luyang
- Salami, Malik Oyewale
- Schneider, Jodi
- Issue Date
- 2025-07-18
- Keyword(s)
- Crossref
- Data Quality
- Metadata
- Retraction
- Retraction Indexing
- Retraction Labeling
- Retraction Status
- RISRS
- Date of Ingest
- 2025-09-08T17:39:24-05:00
- Abstract
- The term “retracted paper” refers to a paper that the publisher has officially flagged because of flaws or errors in its content or data. Once a paper is retracted, it is no longer considered a reliable source and should not be cited as evidence. To avoid unintentionally citing retracted papers, researchers need to be able to determine which papers are retracted. However, this is difficult because sources disagree substantially: according to our prior work, one in five articles was indexed as both retracted and not retracted across Crossref, Retraction Watch, Scopus, and Web of Science. One reason for these discrepancies is incorrect indexing of DOIs, especially those of retraction notices and corrigenda. Here, we investigate the most significant discrepancy: 9,937 DOIs that Crossref, but no other source, indexed as retracted publications, which suggests that many are likely not retracted at all. This project combines components of our prior work. Our previous work made pattern-based predictions that labeled each DOI as 'likely_retracted' or 'likely_retraction' based on phrases such as “retracted”, “[retracted]”, “retraction”, “correction”, and “correction to” (which we call “retraction status patterns”) found at the beginning or end of the title, or as the full title. Here, we manually inspected each DOI and categorized it as: (i) a retracted paper, (ii) a retraction notice, (iii) both a retracted paper and a retraction notice, or (iv) neither a retracted paper nor a retraction notice. We compared our pattern-based predictions to our manually assigned categories to test how reliably the predictions distinguish retracted publications from retraction notices. We are particularly interested in testing how consistently different journals use retraction phrases in their titles.
In the future, we will use validated retraction status patterns to remove corrections and retraction notices from our union list of retracted publications. We will communicate our results to Crossref in order to encourage them to make corrections in their data. Our research will ultimately benefit the entire research ecosystem as data from Crossref is ingested into many bibliographic databases.
- Has Part
- https://hdl.handle.net/2142/125231
- https://doi.org/10.13012/B2IDB-9099305_V1
- https://doi.org/10.13012/B2IDB-2907908_V1
- https://doi.org/10.13012/B2IDB-5333456_V1
- https://doi.org/10.55835/6441e5cae04dbe5586d06a5f
- https://zenodo.org/records/8336538
- https://hdl.handle.net/2142/129058
- Type of Resource
- text
- Genre of Resource
- presentation/lecture/speech
- Language
- eng
- Sponsor(s)/Grant Number(s)
- Alfred P. Sloan Foundation G-2022-19409
- U.S. National Science Foundation 2046454
- Perrin Moorhead Grayson and Bruns Grayson Fellowship at the Harvard Radcliffe Institute for Advanced Study (2024-2025)
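The pattern-based labeling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `predict_label`, the phrase-to-label mapping, and the `likely_correction` label are assumptions made for illustration. Only the phrases themselves, the two labels 'likely_retracted' and 'likely_retraction', and the rule that a phrase must appear at the beginning or end of the title (or be the full title) come from the abstract.

```python
import re

# Assumed mapping from title phrases to labels. The phrases come from the
# abstract; which label each phrase maps to, and the "likely_correction"
# label itself, are illustrative assumptions.
PATTERNS = [
    ("correction to", "likely_correction"),
    ("correction", "likely_correction"),
    ("[retracted]", "likely_retracted"),
    ("retracted", "likely_retracted"),
    ("retraction", "likely_retraction"),
]


def predict_label(title):
    """Return a label if a phrase is the full title or sits at its start or end.

    Returns None when no retraction status pattern is found.
    """
    t = title.strip().lower()
    for phrase, label in PATTERNS:
        p = re.escape(phrase)
        # Lookarounds keep "retracted" from matching inside a longer word.
        at_start = re.match(p + r"(?![a-z0-9])", t)
        at_end = re.search(r"(?<![a-z0-9])" + p + r"$", t)
        if at_start or at_end:
            return label
    return None


if __name__ == "__main__":
    # Made-up titles, used only to exercise the function.
    for example in [
        "RETRACTED: Effects of X on Y",
        "Effects of X on Y [Retracted]",
        "Retraction",
        "Correction to: Effects of X on Y",
        "A study of retracted papers in biology",
    ]:
        print(example, "->", predict_label(example))
```

A phrase in the middle of a title, as in the last example, is deliberately ignored, because such titles usually belong to papers about retraction rather than to retracted papers or notices. Predictions of this kind would then be compared against the manually assigned categories (i) to (iv).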