Blog

URLs and DOIs: a complicated relationship

As the linking hub for scholarly content, it’s our job to tame URLs and put in their place something better. Why? Most URLs suffer from link rot and can be created, deleted or changed at any time. And that’s a problem if you’re trying to cite them.

Using the Crossref Metadata API. Part 2 (with PaperHive)

We first met the team from PaperHive at SSP in June, pointed them in the direction of the Crossref Metadata API and let things progress from there. That’s the nice thing about having an API: because it’s a common and easy way for developers to access and use metadata, it can be used with lots of diverse systems and services.

So how are things going? Alexander Naydenov, PaperHive’s Co-founder, gives us an update on how they’re working with the Crossref metadata:

Using AWS S3 as a large key-value store for Chronograph

One of the cool things about working in Crossref Labs is that interesting experiments come up from time to time. One experiment, entitled “what happens if you plot DOI referral domains on a chart?” turned into the Chronograph project. In case you missed it, Chronograph analyses our DOI resolution logs and shows how many times each DOI link was resolved per month, and also how many times a given domain referred traffic to DOI links per day.

We’ve released a new version of Chronograph. This post explains how it was put together. One for the programmers out there.
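For the programmers, here is a minimal sketch of the two tallies described above: resolutions per DOI per month, and referrals per domain per day. It is not Chronograph's actual code. The `entries` tuples are a hypothetical, already-parsed form of the resolution logs (the real log format isn't described here), the DOIs use a placeholder prefix, and `summarise` is an illustrative name.

```python
from collections import Counter
from datetime import date

# Hypothetical, already-parsed log entries: (day, DOI, referring domain or None).
# The real log format is an assumption; these values are made up for illustration.
entries = [
    (date(2016, 1, 3), "10.5555/12345678", "en.wikipedia.org"),
    (date(2016, 1, 3), "10.5555/12345678", None),
    (date(2016, 1, 20), "10.5555/87654321", "en.wikipedia.org"),
    (date(2016, 2, 1), "10.5555/12345678", "example.com"),
]


def summarise(entries):
    """Count resolutions per (DOI, year, month) and referrals per (domain, day)."""
    per_doi_month = Counter()
    per_domain_day = Counter()
    for day, doi, domain in entries:
        # DOIs are case-insensitive, so normalise before counting.
        per_doi_month[(doi.lower(), day.year, day.month)] += 1
        # Only resolutions that carried a referrer count towards a domain.
        if domain is not None:
            per_domain_day[(domain, day)] += 1
    return per_doi_month, per_domain_day


doi_month, domain_day = summarise(entries)
```

Keying the counters on tuples keeps the sketch short; a real system handling years of logs would need a proper store behind it.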

HTTPS and Wikipedia

This is a joint blog post with Dario Taraborelli, coming from WikiCite 2016.

In 2014 we were taking our first steps along the path that would lead us to Crossref Event Data. At this time I started looking into the DOI resolution logs to see if we could get any interesting information out of them. This project, which became Chronograph, showed which domains were driving traffic to Crossref DOIs.

You can read about the latest results from this analysis in the “Where do DOI Clicks Come From” blog post.

Having this data tells us, amongst other things:

  • where people are using DOIs in unexpected places
  • where people are using DOIs in unexpected ways
  • where we knew people were using DOIs but the links are more popular than we realised

Where do DOI clicks come from?

As part of our Event Data work we’ve been investigating where DOI resolutions come from. A resolution could be someone clicking a DOI hyperlink, a search engine spider gathering data, or a publisher’s system performing its duties. Our server logs tell us every time a DOI was resolved and, if it was by someone using a web browser, which website they were on when they clicked the DOI. This is called a referral.

This information is interesting because it shows not only where DOI hyperlinks are found across the web, but also when they are actually followed. This data allows us a glimpse into scholarly citation beyond references in traditional literature.
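To make the idea of a referral concrete, here is a small sketch of turning a referrer URL into the website it came from. The function name `referring_domain` is hypothetical, and treating `"-"` as "no referrer" is an assumption borrowed from common web server log conventions, not something confirmed about our logs.

```python
from urllib.parse import urlsplit


def referring_domain(referrer):
    """Return the hostname of a referrer URL, or None if there is no referral."""
    # Assumption: an empty value or "-" means the request carried no referrer,
    # as is typical for non-browser clients such as spiders or publisher systems.
    if not referrer or referrer == "-":
        return None
    # urlsplit(...).hostname is already lower-cased, and is None when the
    # value has no recognisable host part.
    return urlsplit(referrer).hostname


domain = referring_domain("https://en.wikipedia.org/wiki/Digital_object_identifier")
```

Resolutions where this returns `None` still count as resolutions; they just can't be attributed to a website.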

Crossref Event Data: early preview now available

Test out the early preview of Event Data while we continue to develop it. Share your thoughts. And be warned: we may break a few eggs from time to time!

Chicken by anbileru adaleru from The Noun Project

Want to discover which research works are being shared, liked and commented on? What about the number of times a scholarly item is referenced? Starting today, you can whet your appetite with an early preview of the forthcoming Crossref Event Data service. We invite you to start exploring the activity of DOIs as they permeate and interact with the world after publication.

Event Data: open for your interpretation

What happens to a research work outside of the formal literature? That’s what Event Data will aim to answer when the service launches later this year.

Following the successful DOI Event Tracker pilot in Spring 2014, development has been underway to build our new service, newly renamed Crossref Event Data. It’s an open data service that registers online activity (specifically, events) associated with Crossref metadata. Event Data will collect and store a record of any activity surrounding a research work from a defined set of web sources. The data will be normalised across a diverse set of sources and made available as part of our metadata search service or via our Metadata API. Data will be open, auditable and replicable.

DOIs in Reddit

Skimming the headlines on Hacker News yesterday morning, I noticed something exciting: a dump of all the submissions to Reddit since 2006. “How many of those are DOIs?”, I thought. Reddit is a very broad community, but has some very interesting parts, including some great science communication. How much are DOIs used in Reddit?

(There has since been a discussion about this blog post on Hacker News)

We have a whole strategy for DOI Event Tracking, but nothing beats a quick hack or is more irresistible than a data dump.
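A quick hack of this kind might look something like the sketch below. It is not the script used for this post. It assumes the dump is one JSON object per line with a `url` field (a hypothetical layout), and it uses a commonly used regular expression for modern DOIs that will miss some older or unusual ones. The sample lines and the `dois_in_submissions` name are made up for illustration.

```python
import json
import re

# A commonly used pattern for modern DOIs; it does not match every DOI ever issued.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[-._;()/:A-Z0-9]+", re.IGNORECASE)


def dois_in_submissions(lines):
    """Yield the first DOI found in each submission's URL, lower-cased."""
    for line in lines:
        submission = json.loads(line)
        # Assumption: each submission has a "url" field; treat missing as empty.
        url = submission.get("url") or ""
        match = DOI_PATTERN.search(url)
        if match:
            yield match.group(0).lower()


# Made-up sample lines standing in for the real dump.
sample = [
    '{"url": "http://dx.doi.org/10.5555/12345678"}',
    '{"url": "https://example.com/not-a-doi"}',
]
found = list(dois_in_submissions(sample))
```

This only catches DOIs that appear in the submitted link itself; links to publisher landing pages that don't contain the DOI would need a different approach.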

DOI Event Tracker (DET): Pilot progresses and is poised for launch

Publishers, researchers, funders, institutions and technology providers are all interested in better understanding how scholarly research is used. Scholarly content has always been discussed by scholars outside the formal literature and by others beyond the academic community. We need a way to monitor and distribute this valuable information.

Real-time Stream of DOIs being cited in Wikipedia

TL;DR

Watch a real-time stream of DOIs being cited (and “un-cited!”) in Wikipedia articles across the world: http://goo.gl/0AknMJ

Background

For years we’ve known that the Wikipedia was a major referrer of Crossref DOIs and about a year ago we confirmed that, in fact, the Wikipedia is the 8th largest referrer of Crossref DOIs. We know that people follow the DOIs, too. This is despite only a fraction of Wikipedia citations to the scholarly literature using DOIs. So back in August we decided to create a Wikimedia Ambassador programme. The goal of the programme was to promote the use of persistent identifiers in citation and attribution in Wikipedia articles. We would do this through outreach and through the development of better citation-related tools.