Blog

Distributed Usage Logging: A private channel for private data

Jennifer Lin – 2015 December 04

In Data, Identifiers, Usage

Forty wire telephone switchboard, 1907, Author unknown, Popular Science Monthly Vol 70, Wikimedia Commons.

A few months ago Crossref announced that we will be launching a new service for the community in 2016 that tracks activity around DOIs by recording user interactions with content. These “events” cover a broad spectrum of online activities including publication usage, links to datasets, social bookmarks, blog mentions, social shares, comments, recommendations, etc. The Event Data service collects the data and makes it available to all in an open clearinghouse so that the data are open, comparable, auditable, and portable. These data are all publicly available from external platform partners, and they meet the terms of distribution from each partner.

Auto-Update Has Arrived! ORCID Records Move to the Next Level

Crossref goes live in tandem with DataCite to push both publication and dataset information to ORCID profiles automatically. All organisations that deposit ORCID iDs with Crossref and/or DataCite will see this information going further, automatically updating author records. 

DOIs in Reddit

Skimming the headlines on Hacker News yesterday morning, I noticed something exciting: a dump of all the submissions to Reddit since 2006. “How many of those are DOIs?”, I thought. Reddit is a very broad community, but it has some very interesting parts, including some great science communication. How much are DOIs used in Reddit?

(There has since been a discussion about this blog post on Hacker News)

We have a whole strategy for DOI Event Tracking, but nothing beats a quick hack or is more irresistible than a data dump.

DOI Event Tracker (DET): Pilot progresses and is poised for launch

Publishers, researchers, funders, institutions and technology providers are all interested in better understanding how scholarly research is used. Scholarly content has always been discussed by scholars outside the formal literature and by others beyond the academic community. We need a way to monitor and distribute this valuable information.

DOIs and matching regular expressions

We regularly see developers using regular expressions to validate or scrape for DOIs. For modern Crossref DOIs the regular expression is short:

/^10\.\d{4,9}\/[-._;()\/:A-Z0-9]+$/i

For the 74.9M DOIs we have seen, this matches 74.4M of them. If you need to use only one pattern, then use this one.
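As a minimal sketch of how that pattern might be used for validation, here it is in Python syntax, with the dot after `10` escaped so that it matches only a literal period. The function name `is_modern_crossref_doi` is hypothetical, and the example DOIs are illustrative strings rather than entries from the dataset described above.

```python
import re

# The pattern from the post, in Python syntax. The dot after "10" is
# escaped so it matches a literal period; re.IGNORECASE plays the role
# of the trailing /i flag.
MODERN_DOI = re.compile(r"^10\.\d{4,9}/[-._;()/:A-Z0-9]+$", re.IGNORECASE)


def is_modern_crossref_doi(candidate: str) -> bool:
    """Return True if the whole string looks like a modern Crossref DOI.

    Hypothetical helper: this checks syntax only and says nothing about
    whether the DOI is actually registered or resolves.
    """
    return MODERN_DOI.fullmatch(candidate) is not None


if __name__ == "__main__":
    for doi in ["10.1000/182", "10.5555/abc-DEF_123", "11.1000/182", "10.1000/a<b>"]:
        print(doi, is_modern_crossref_doi(doi))
```

Strings containing characters outside the bracketed class, such as `<` or `>`, are rejected by this pattern, which is one way a DOI can be valid and still not match.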

Rehashing PIDs without stabbing myself in the eyeball

Anybody who knows me or reads this blog is probably aware that I don’t exactly hold back when discussing problems with the DOI system. But just occasionally I find myself actually defending the thing…

January 2015 DOI Outage: Followup Report

Background

On January 20th, 2015 the main DOI HTTP proxy at doi.org experienced a partial, rolling global outage. The system was never completely down, but for at least part of the subsequent 48 hours, up to 50% of DOI resolution traffic was effectively broken. This was true for almost all DOI registration agencies, including Crossref, DataCite and mEDRA.

At the time we kept people updated on what we knew via Twitter, mailing lists and our technical blog at CrossTech. We also promised that, once we’d done a thorough investigation, we’d report back. Well, we haven’t finished investigating all implications of the outage. There are both substantial technical and governance issues to investigate. But last week we provided a preliminary report to the Crossref board on the basic technical issues, and we thought we’d share that publicly now.

Crossref’s DOI Event Tracker Pilot

TL;DR

Crossref’s “DOI Event Tracker Pilot”: 11 million+ DOIs & 64 million+ events. You can play with it at: http://goo.gl/OxImJa

Tracking DOI Events

So have you been wondering what we’ve been doing since we posted about the experiments we were conducting using PLOS’s open source ALM code? A lot, it turns out. About a week after our post, we were contacted by a group of our members from OASPA who expressed an interest in working with the system. Apparently they were all about to conduct similar experiments using the ALM code, and they thought that it might be more efficient and interesting if they did so together using our installation. Yippee. Publishers working together. That’s what we’re all about.

Problems with dx.doi.org on January 20th 2015: what we know.

Hell’s teeth.

So today (January 20th, 2015) the DOI HTTP resolver at dx.doi.org started to fail intermittently around the world. The doi.org domain is managed by CNRI on behalf of the International DOI Foundation. This means that the problem affected all DOI registration agencies including Crossref, DataCite, mEDRA etc. This also means that more popularly known end-user services like FigShare and Zenodo were affected. The problem has been fixed, but the fix will take some time to propagate throughout the DNS system. You can monitor the progress here:

https://www.whatsmydns.net/#A/doi.org

Now for the embarrassing stuff…

♫ Researchers just wanna have funds ♫

Summary

You can use a new Crossref API to query all sorts of interesting things about who funded the research behind the content Crossref members publish.

Background

Back in May 2013 we launched Crossref’s FundRef service. It can be summarized like this: Crossref keeps and manages a canonical list of Funder Names (ephemeral) and associated identifiers (persistent). We encourage our members (or anybody, really; the list is available under a CC-Zero license waiver) to use this list for collecting information on who funded the research behind the content that our members publish.
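To give a rough idea of what querying funder information can look like, the sketch below only builds request URLs and does not send them. It assumes the public Crossref REST API at `api.crossref.org` with its `/funders` and `/funders/{id}/works` routes. The helper names and the funder identifier used in the example are placeholders for illustration, not details taken from this post.

```python
from urllib.parse import quote, urlencode

# Assumption: the public Crossref REST API base URL.
API_BASE = "https://api.crossref.org"


def funder_search_url(name_query: str) -> str:
    """Build a URL that searches the funder list by name (hypothetical helper)."""
    return f"{API_BASE}/funders?{urlencode({'query': name_query})}"


def funder_works_url(funder_id: str, rows: int = 5) -> str:
    """Build a URL listing works associated with one funder identifier.

    `rows` limits how many records are requested.
    """
    safe_id = quote(funder_id, safe="")
    return f"{API_BASE}/funders/{safe_id}/works?{urlencode({'rows': rows})}"


if __name__ == "__main__":
    # Placeholder inputs; swap in a real funder name or identifier.
    print(funder_search_url("national science"))
    print(funder_works_url("100000001", rows=5))
```

Each URL can then be fetched with any HTTP client, and the response is JSON.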