Patci — a tool for identifying scientific articles cited by patents
Agarwal, Sneha; Lincoln, Miles; Cai, Haoyan; Torvik, Vetle I.
Permalink
https://hdl.handle.net/2142/54885
Description
Title
Patci — a tool for identifying scientific articles cited by patents
Author(s)
Agarwal, Sneha
Lincoln, Miles
Cai, Haoyan
Torvik, Vetle I.
Issue Date
2014-03-14
Keyword(s)
citation matcher
USPTO Patents
PubMed
DBLP
probabilistic matching
bibliographic databases
patent-to-paper citations
Abstract
Scientific research increasingly drives innovation and the development of new technologies, and patent-to-paper citations can be used to trace this diffusion of knowledge and to measure these science-to-technology spillover effects. However, the so-called “non-patent citations” in USPTO records do not contain authoritative identifiers, nor do they adhere to a standard format. They are free-form strings, often much too free, which makes it hard to systematically identify the articles or other works cited. Here, we introduce Patci, a tool that takes a citation string and probabilistically identifies matching records from a set of bibliographic databases. It currently permits matching to biomedical literature (21.5M PubMed records) and computing/information sciences literature (3.2M DBLP records). It uses a probabilistic model trained on USPTO records but also works well for citations originating from outside the patenting sphere. The algorithm extracts and weighs several hundred predictive features and does not rely on punctuation to delimit fields. A match probability is attached to each source link ID (e.g., PMID), which permits setting an application-appropriate level of match stringency and permits sensitivity analysis. All 16M citations listed in granted USPTO patents (1975-present) have been processed and are available as a separate dataset.
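As an illustration of how a per-link match probability might be used downstream, the following is a minimal sketch in Python. It is not Patci's code or data format: the `Match` record, the functions `filter_matches` and `sensitivity`, and the example identifiers and probabilities are all hypothetical, chosen only to show stringency thresholding and a simple sensitivity analysis.

```python
from typing import NamedTuple


class Match(NamedTuple):
    """Hypothetical record: one candidate link from a citation to a source ID."""
    citation_id: str
    source_id: str      # e.g. a PMID; the values below are made up
    probability: float  # match probability attached to the link


def filter_matches(matches, threshold):
    """Keep only links whose match probability meets the chosen stringency."""
    return [m for m in matches if m.probability >= threshold]


def sensitivity(matches, thresholds):
    """Count how many links survive at each threshold."""
    return {t: len(filter_matches(matches, t)) for t in thresholds}


# Made-up example links, for illustration only.
matches = [
    Match("c1", "PMID:111", 0.99),
    Match("c2", "PMID:222", 0.80),
    Match("c3", "PMID:333", 0.45),
]

# Compare a lenient and a strict threshold.
counts = sensitivity(matches, [0.5, 0.9])
print(counts)
```

A stricter threshold trades recall for precision, so reporting results at several thresholds shows how strongly a downstream conclusion depends on uncertain links.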
Publisher
GSLIS Research Showcase
Type of Resource
other
Language
en
Sponsor(s)/Grant Number(s)
National Institute on Aging of the NIH (Award Number P01AG039347)
Science of Science and Innovation Policy program of the NSF (Award Number 0965341)