Blog

Helping researchers identify content they can text mine

TL;DR: Many organizations are doing what they can to aid in the response to the COVID-19 pandemic. Crossref members can make it easier for researchers to identify, locate, and access content for text mining. To do this, members must include elements in their metadata that (1) point to the full text of the content and (2) indicate that the content is available under an open access license or that it is being made available for free (gratis).
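
From the researcher's side, these two kinds of metadata are what make content discoverable for text mining. The following is a minimal sketch, not an official client: it assumes the `has-full-text` and `has-license` filters and the `link` and `license` fields of the Crossref REST API's `/works` records, and the sample record at the bottom is hypothetical and only shows the expected shape.

```python
from urllib.parse import urlencode

API_BASE = "https://api.crossref.org/works"


def build_query_url(rows=20):
    """Build a /works URL for records that carry full-text links and license data.

    Assumption: the filter names follow the public REST API documentation.
    """
    params = {"filter": "has-full-text:true,has-license:true", "rows": rows}
    return f"{API_BASE}?{urlencode(params)}"


def text_mining_links(work):
    """Return the full-text URLs in one work record that are marked for text mining."""
    return [
        link["URL"]
        for link in work.get("link", [])
        if link.get("intended-application") == "text-mining" and "URL" in link
    ]


def license_urls(work):
    """Return the license URLs attached to one work record."""
    return [lic["URL"] for lic in work.get("license", []) if "URL" in lic]


# Hypothetical record, for illustration only (not real Crossref data).
sample_work = {
    "DOI": "10.5555/example",
    "link": [
        {"URL": "https://example.org/fulltext.xml",
         "content-type": "application/xml",
         "intended-application": "text-mining"},
        {"URL": "https://example.org/fulltext.pdf",
         "content-type": "application/pdf",
         "intended-application": "similarity-checking"},
    ],
    "license": [{"URL": "https://creativecommons.org/licenses/by/4.0/"}],
}

if __name__ == "__main__":
    print(build_query_url(rows=5))
    print(text_mining_links(sample_work))
    print(license_urls(sample_work))
```

A record with no `link` or `license` entries simply yields empty lists, which is the case the post asks members to help avoid.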

Free public data file of 112+ million Crossref records

A lot of people have been using our public, open APIs to collect data that might be related to COVID-19. This is great and we encourage it. We also want to make it easier. To that end we have made a free data file of the public elements from Crossref’s 112.5 million metadata records. The file (65GB, in JSON format) is available via Academic Torrents at https://doi.org/10.13003/83B2GP. It is important to note that Crossref metadata is always openly available.

You’ve had your say, now what? Next steps for schema changes

It seems like ages ago, particularly given recent events, but we had our first public request for feedback on proposed schema updates in December and January. The feedback we received indicated two big things: we’re on the right track, and you want us to go further. The proposed update makes some significant changes to contributors but is otherwise fairly moderate. The feedback was mostly supportive, with a fair number of helpful suggestions about details.

Encouraging even greater reporting of corrections and retractions

TL;DR: We no longer charge fees for members to participate in Crossmark, and we encourage all our members to register metadata about corrections and retractions - even if you can’t yet add the Crossmark button and pop-up box to your landing pages or PDFs.

Events got the better of us

Publisher metadata is one side of the story surrounding research outputs, but the conversations, connections and activities that build around scholarly research take place all over the web. We built Event Data to capture, record and make available these ‘Events’, providing open, transparent, and traceable information about the provenance and context of every Event. Events are comments, links, shares, bookmarks, references, etc.

Metadata Manager Update

At Crossref, we’re committed to providing a simple, usable, efficient and scalable web-based tool for registering content by manually making deposits of, and updates to, metadata records. Last year we launched Metadata Manager in beta for journal deposits to help us explore this further. Since then, many members have used the tool and helped us better understand their needs.

Double trouble with DOIs

Detective Matcher stopped abruptly behind the corner of a short building, praying that his loud heartbeat wouldn’t give away his presence. This missing DOI case was unlike any other before, keeping him awake for many seconds already. It took a great effort and a good amount of help from his clever assistant Fuzzy Comparison to make sense of the sparse clues provided by Miss Unstructured Reference, an elegant young lady with a shy smile, who begged him to take up this case at any cost.

Crossref metadata for bibliometrics

Our paper, Crossref: the sustainable source of community-owned scholarly metadata, was recently published in Quantitative Science Studies (MIT Press). The paper describes the scholarly metadata collected and made available by Crossref, as well as its importance in the scholarly research ecosystem.

Using the Crossref REST API (with Open Ukrainian Citation Index)

Over the past few years, I’ve been really interested in seeing the breadth of uses that the research community is finding for the Crossref REST API. When we ran Crossref LIVE Kyiv in March 2019, Serhii Nazarovets joined us to present his plans for the Open Ukrainian Citation Index, an initiative he explains below. But first an introduction to Serhii and his colleague Tetiana Borysova. Serhii Nazarovets is a Deputy Director for Research at the State Scientific and Technical Library of Ukraine.

Proposed schema changes - have your say

The first version of our metadata input schema (a DTD, to be specific) was created in 1999 to capture basic bibliographic information and facilitate matching DOIs to citations. Over the past 20 years the bibliographic metadata we collect has deepened, and we’ve expanded our schema to include funding information, license, updates, relations, and other metadata. Our schema isn’t as venerable as a MARC record or as comprehensive as JATS, but it’s served us well.