Event Data is our service to capture online mentions of Crossref records. We monitor data archives, Wikipedia, social media, blogs, news, and other sources. Our main focus has been on gathering data from external sources; however, we know that there is a great deal of Crossref metadata that can be made available as events. Earlier this year we started adding relationship metadata, and over the last few months we have been working on bringing in citations between records.
Tl;dr: Metadata for the (currently 26,000) grants that have been registered by our funder members is now available via the REST API. This is quite a milestone in our program to include funding in Crossref infrastructure and a step forward in our mission to connect all.the.things. This post gives you all the queries you might need to satisfy your curiosity and start to see what’s possible with deeper analysis. So have a look and see what useful things you can discover.
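As a starting point, here is a minimal sketch of constructing such a query with Python's standard library. The `/works` route and the `type:grant` filter are assumptions about the current API shape, so check the REST API documentation for the exact routes and filter names:

```python
from urllib.parse import urlencode

BASE = "https://api.crossref.org"

def grants_query(rows=20, **extra_filters):
    """Build a REST API query URL for grant records.

    The '/works' route and 'type:grant' filter are assumptions here;
    consult the live API documentation for the exact names.
    """
    filters = {"type": "grant", **extra_filters}
    filter_str = ",".join(f"{k}:{v}" for k, v in filters.items())
    return f"{BASE}/works?{urlencode({'filter': filter_str, 'rows': rows})}"

print(grants_query())
```

You could then fetch the resulting URL with any HTTP client and page through the JSON results.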
Update on the outage of October 6th. In my blog post on October 6th, I promised an update on what caused the outage and what we are doing to avoid it happening again. This is that update.
Crossref hosts its services in a hybrid environment. Our original services are all hosted in a data center in Massachusetts, but we host new services with a cloud provider. We also have a few R&D systems hosted with Hetzner.
Looking at the road ahead, we’ve set some ambitious goals for ourselves and continue to see new members join from around the world, now numbering 16,000. To help achieve all that we plan in the years to come, we’ve grown our teams quite a bit over the last couple of years, and we are happy to welcome Carlos, Evans, Fabienne, Mike, Panos, and Patrick.
We test a broad sample of DOIs to ensure resolution. For each journal crawled, we select a sample of DOIs equal to 5% of the journal's total DOIs, up to a maximum of 50. The selected DOIs span prefixes and issues.
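The sampling rule above can be sketched as follows. This is a rough illustration only: `sample_size` and `select_sample` are hypothetical names, and the real selection also spreads picks across prefixes and issues rather than sampling uniformly at random.

```python
import math
import random

def sample_size(total_dois, rate=0.05, cap=50):
    """5% of the journal's DOIs, capped at 50; at least 1 if any exist."""
    if total_dois <= 0:
        return 0
    return min(cap, max(1, math.ceil(total_dois * rate)))

def select_sample(dois):
    """Pick a random sample of the journal's DOIs (simplified:
    the real crawler also balances across prefixes and issues)."""
    return random.sample(dois, sample_size(len(dois)))

print(sample_size(200))   # 5% of 200 DOIs -> 10
print(sample_size(2000))  # 5% would be 100, capped at 50
```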
The results are recorded in crawler reports, which you can access from the depositor report expanded view. If a title has been crawled, the last crawl date is shown in the appropriate column. Crawled DOIs that generate errors will appear as a bold link:
Click Last Crawl Date to view a crawler status report for a title:
The crawler status report lists the following:
Total DOIs: total number of DOI names for the title in the system on the last crawl date
Checked: number of DOIs crawled
Confirmed: crawler found both the DOI and the article title on the landing page
Semi-confirmed: crawler found either the DOI or the article title on the landing page
Not Confirmed: crawler found neither the DOI nor the article title on the landing page
Bad: page contains known phrases indicating the article is not available (for example, "article not found" or "no longer available")
Login Page: crawler is prompted to log in; no article title or DOI found
Exception: indicates an error in the crawler code
httpCode: resolution attempt resulted in an HTTP error code (such as 400, 403, 404, or 500)
httpFailure: connection to the HTTP server failed
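The Confirmed/Semi-confirmed/Not Confirmed/Bad distinctions reduce to simple presence checks on the landing page. Here is a sketch of that logic; `classify` is a hypothetical name, and the crawler's actual heuristics are more involved than plain substring matching:

```python
# Phrases that mark a page as "Bad" (examples from the report description).
BAD_PHRASES = ("article not found", "no longer available")

def classify(page_text, doi, title):
    """Map a landing page to a crawler status, per the categories above
    (simplified sketch; the real crawler is more sophisticated)."""
    text = page_text.lower()
    if any(phrase in text for phrase in BAD_PHRASES):
        return "Bad"
    has_doi = doi.lower() in text
    has_title = title.lower() in text
    if has_doi and has_title:
        return "Confirmed"
    if has_doi or has_title:
        return "Semi-confirmed"
    return "Not Confirmed"

print(classify("DOI: 10.5555/12345678 | A Study of Things",
               "10.5555/12345678", "A Study of Things"))  # Confirmed
```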
Select any number to view details. To trigger a new crawl, select re-crawl and enter an email address.
Page owner: Isaac Farley | Last updated 2020-April-08