Intro to the Crossref APIverse

Author: Luis Montilla

Affiliation: Crossref

Published: January 12, 2024

So you want to retrieve data from Crossref using R. If you want to jump straight to the coding part, skip ahead to the Libraries section. If you are new to APIs or to Crossref's role in the scholarly metadata ecosystem, let me invite you to read on.

Crossref and the research nexus

At Crossref, we make research objects easy to find, cite, link, assess, and reuse. Our goal is to connect all research knowledge through open and persistent metadata. We call this interconnected knowledge ecosystem the Research Nexus.

The Research Nexus

The metadata we collect and make available are the identifiers and properties of different items in the scholarly record. We don’t collect the full text or the actual datasets that constitute a scholarly paper; instead, we are interested in the relationships between the paper, the authors, institutions, journals, funding agencies, protocols, and more.

We make this metadata open through our REST API

Our data is readily available. You have three access levels to the API metadata:

  • Public. Free, fully anonymous.

  • Polite. Free, you provide your email (recommended). We can use this information to contact you in case of issues. We get rid of it after 90 days.

  • Plus. Our paid premium service. You get:

    • A service level agreement guaranteeing you extra service and support, giving you a consistent and predictable experience.
    • Additional features such as snapshots and priority service/rate limits.
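
One way a request can identify itself for the polite level is by carrying a `mailto` query parameter. Here is a minimal sketch of what that looks like at the URL level. `crossref_works_url()` is a hypothetical helper written for this illustration (it is not part of rcrossref), and the search phrase and email address are placeholders:

```r
# Hypothetical helper: build a URL for the /works route of the Crossref REST API.
# Leaving `mailto` as NULL gives an anonymous (public) request;
# supplying an email adds the mailto parameter used for polite requests.
crossref_works_url <- function(query, rows = 5, mailto = NULL) {
  params <- c(query = query, rows = as.character(rows))
  if (!is.null(mailto)) {
    params <- c(params, mailto = mailto)
  }
  # Percent-encode each value so spaces and "@" are safe inside a URL
  encoded <- vapply(params, utils::URLencode, character(1), reserved = TRUE)
  paste0(
    "https://api.crossref.org/works?",
    paste(names(encoded), encoded, sep = "=", collapse = "&")
  )
}

public_url <- crossref_works_url("coral reefs", rows = 2)
polite_url <- crossref_works_url("coral reefs", rows = 2,
                                 mailto = "name@example.com")
```

The two URLs are identical except for the trailing `mailto` parameter. Later in this document, rcrossref takes care of this detail for us once the email is configured.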

What’s an API, by the way?

An API (application programming interface) is a software intermediary that allows two applications to talk to each other. This means that you don’t get the data directly, as you would when downloading the data files associated with a research paper. Instead, you send a request and the API returns the matching data. In this document, we use the rcrossref package to build the requests that the Crossref API handles, returning data from Crossref’s servers.

flowchart LR
  A(Client) --> B{API}
  B --> C(Server)
  C --> B
  B --> A
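
The round trip in the diagram can be sketched without any Crossref-specific package. The example below assumes the jsonlite package is installed (it comes with the tidyverse) and that you have an internet connection. `ask_crossref()` is a hypothetical name chosen for this illustration:

```r
library(jsonlite)

# Hypothetical helper playing the role of the client in the diagram:
# it sends one request URL to the API and returns the parsed reply.
ask_crossref <- function(url) {
  # fromJSON() downloads the JSON text the API sends back
  # and converts it into a nested R list
  fromJSON(url)
}

# The request itself is just a URL; rows=1 asks for a single record
request_url <- "https://api.crossref.org/works?rows=1"
```

Running `reply <- ask_crossref(request_url)` in an interactive session performs the request. In the reply, `reply$status` reports whether the request succeeded and `reply$message` holds the returned metadata.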

Ready to dive into some code?

Libraries

We’ll use the rcrossref package (Chamberlain et al. 2022) to interact with the Crossref API from the comfort of our R environment. We’ll also load gt to quickly produce better-looking data tables.

Code
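library(rcrossref)  # client for the Crossref API
library(tidyverse)  # data wrangling and plotting
library(gt)         # nicely formatted tables
library(magrittr)   # additional pipe operators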

I will be polite. How do I provide my information?

We should add an email so that Crossref can contact us if our API requests cause problems. We can set it as an environment variable for the current R session:

Code
Sys.setenv(crossref_email = "yourname@yourorg.org")

Alternatively, to keep the setting across sessions, we can add it to the .Renviron file. First, open the file by executing:

Code
file.edit("~/.Renviron")

Then add the email address to be shared with Crossref on a new line, as crossref_email = name@example.com

Finally, save the file and restart your R session.
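
To confirm that the setting took effect, we can read the variable back. This is a minimal check using base R only; the messages are illustrative wording, not output from rcrossref:

```r
# Read the variable back; an empty string means it is not set in this session
email <- Sys.getenv("crossref_email")

if (nzchar(email)) {
  message("crossref_email is set to: ", email)
} else {
  message("crossref_email is not set; check .Renviron and restart R.")
}
```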

Let’s start with some examples:
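
As a first taste, here is a minimal sketch of a free-text search with rcrossref's `cr_works()`. The wrapper name `first_works()` and its defaults are made up for this illustration, and the call needs an internet connection:

```r
library(rcrossref)

# Hypothetical wrapper: search works by free text and keep a couple of columns
first_works <- function(query, n = 3) {
  # cr_works() returns a list; the records are in its `data` element
  res <- cr_works(query = query, limit = n)
  # Keep only the columns that are actually present in the result
  keep <- intersect(c("doi", "title"), names(res$data))
  res$data[, keep]
}
```

For example, `first_works("coral reef restoration")` should return a small table with the DOI and title of up to three matching works.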

Other resources

References

Chamberlain, Scott, Hao Zhu, Najko Jahn, Carl Boettiger, and Karthik Ram. 2022. Rcrossref: Client for Various ’CrossRef’ ’APIs’. https://CRAN.R-project.org/package=rcrossref.

License

Intro to the Crossref APIverse by Luis Montilla is licensed under CC BY 4.0.