November 28, 2011

Turning DOIs into formatted citations

Today two new content types were added to dx.doi.org resolution for CrossRef DOIs. These allow anyone to retrieve DOI bibliographic metadata as formatted bibliographic entries. To perform the formatting we're using the Citation Style Language (CSL) processor citeproc-js, which supports a shed load of citation styles and locales. In fact, all the styles and locales found in the CSL repositories are supported, including many common styles such as bibtex, apa, ieee, harvard, vancouver and chicago.

First off, if you'd like to try citation formatting without using content negotiation, there's a simple web UI that lets you enter a DOI and select a style and locale.

If you're more into accessing the web via your favorite programming language, have a look at these content negotiation curl examples. To make a request for the new "text/bibliography" content type:

$ curl -LH "Accept: text/bibliography; style=bibtex" http://dx.doi.org/10.1038/nrd842

@article{Atkins_Gershell_2002, title={From the analyst's couch: Selective anticancer drugs}, volume={1}, DOI={10.1038/nrd842}, number={7}, journal={Nature Reviews Drug Discovery}, author={Atkins, Joshua H. and Gershell, Leland J.}, year={2002}, month={Jul}, pages={491-492}}

A locale can be specified with the "locale" content type parameter, like this:

$ curl -LH "Accept: text/bibliography; style=mla; locale=fr-FR" http://dx.doi.org/10.1038/nrd842

Atkins, Joshua H., et Leland J. Gershell. « From the analyst's couch: Selective anticancer drugs ». Nature Reviews Drug Discovery 1.7 (2002): 491-492.
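If you'd rather script this than shell out to curl, the same request can be sketched in Python with nothing but the standard library. This is only a sketch: the function names `bibliography_request` and `fetch_bibliography` are made up for illustration, and the only things taken from the examples above are the dx.doi.org URL, the "text/bibliography" content type and its "style" and "locale" parameters.

```python
import urllib.request


def bibliography_request(doi, style, locale=None):
    """Build (but do not send) a content-negotiated request for a formatted citation.

    Hypothetical helper: it only assembles the Accept header used in the curl examples.
    """
    accept = "text/bibliography; style=" + style
    if locale:
        accept += "; locale=" + locale
    return urllib.request.Request(
        "http://dx.doi.org/" + doi, headers={"Accept": accept}
    )


def fetch_bibliography(doi, style, locale=None):
    """Send the request and return the citation text.

    urlopen follows redirects by default, which plays the role of curl's -L flag.
    """
    request = bibliography_request(doi, style, locale)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_bibliography("10.1038/nrd842", "mla", locale="fr-FR")` should give back the same French MLA entry as the curl example, network permitting.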

You may want to process metadata through CSL yourself. For this use case, there's another new content type, "application/citeproc+json", that returns metadata in a citeproc-friendly JSON form:

$ curl -LH "Accept: application/citeproc+json" http://dx.doi.org/10.1038/nrd842

{"volume":"1","issue":"7","DOI":"10.1038/nrd842","title":"From the analyst's couch: Selective anticancer drugs","container-title":"Nature Reviews Drug Discovery","issued":{"date-parts":[[2002,7]]},"author":[{"family":"Atkins","given":"Joshua H."},{"family":"Gershell","given":"Leland J."}],"page":"491-492","type":"article-journal"}
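To give a feel for what can be done with that JSON before handing it to a real CSL processor, here is a minimal Python sketch that reads the record shown above and builds a rough one-line reference. The function `short_reference` and its output layout are invented for illustration and are not a CSL style. Fields are looked up defensively because any of them may be missing from a given record.

```python
import json

# The record returned by the curl example above, pasted in as a string.
RECORD = '''{"volume":"1","issue":"7","DOI":"10.1038/nrd842","title":"From the analyst's couch: Selective anticancer drugs","container-title":"Nature Reviews Drug Discovery","issued":{"date-parts":[[2002,7]]},"author":[{"family":"Atkins","given":"Joshua H."},{"family":"Gershell","given":"Leland J."}],"page":"491-492","type":"article-journal"}'''


def short_reference(item):
    """Return a rough one-line reference, skipping any field that is absent."""
    families = [a["family"] for a in item.get("author", []) if "family" in a]
    # "issued" holds a list of date-parts lists, e.g. [[2002, 7]] for July 2002.
    date_parts = item.get("issued", {}).get("date-parts") or [[]]
    year = date_parts[0][0] if date_parts[0] else None

    head_parts = []
    if families:
        head_parts.append(" & ".join(families))
    if year is not None:
        head_parts.append("(%s)" % year)
    head = " ".join(head_parts)

    tail = [item[key] for key in ("title", "container-title") if item.get(key)]
    return ". ".join(([head] if head else []) + tail) + "."


item = json.loads(RECORD)
print(short_reference(item))
# Atkins & Gershell (2002). From the analyst's couch: Selective anticancer drugs. Nature Reviews Drug Discovery.
```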

Finally, to retrieve lists of supported styles and locales, either hit these URLs:

or check out the CSL style and locale repositories.

There's one big caveat to all this. The CSL processor will do its best with CrossRef metadata, which can unfortunately be quite patchy at times. There may be pieces of metadata missing, inaccurate metadata or even metadata items stored under the wrong field, all resulting in odd-looking formatted citations. Most of the time, though, it works.

November 22, 2011

Determining the CrossRef membership status of a domain

We've been asked a few times if it is possible to determine whether or not a particular domain name belongs to a CrossRef member. To address this we're launching another small service that performs something like a "reverse look-up" of URLs and domain names to DOIs and CrossRef member status.

The service provides an API that will attempt to reverse look-up a URL to a DOI and return the membership status (member or non-member) of the root domain of the URL. In practice, resolving URLs to DOIs has substantial limitations: many publishers redirect the resolution URL of a DOI to other online content, and URLs become clogged up with session IDs and other cruft in their query parameters. All of this means that the URL you end up at after resolving a DOI is often not the URL the DOI actually points to, so a reverse look-up from that URL is unlikely to find the DOI.

However, it's also possible to provide only a host name, in which case, as with a URL, the CrossRef membership status for the root domain will be returned.

There's also a downloadable list of hashed domains that belong to CrossRef members, which will be useful to those who want to determine the membership status of a domain locally. Also, a bookmarklet allows anyone to easily check whether the web page they are looking at is hosted on a domain that belongs to a CrossRef member.
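A local check against the hashed list might look something like the Python sketch below. Treat every detail as an assumption: this post does not say which hash function the list uses or how the file is laid out, so SHA-1 is just a placeholder, and the helpers `root_domain`, `hash_domain` and `is_member_domain` are hypothetical. The root-domain guess is deliberately naive (last two labels) and will be wrong for suffixes such as .co.uk.

```python
import hashlib


def root_domain(host):
    """Naive root-domain guess: keep the last two dot-separated labels."""
    labels = host.lower().strip(".").split(".")
    return ".".join(labels[-2:])


def hash_domain(domain, algorithm="sha1"):
    """Hash a domain name. The algorithm is an assumption, not taken from the service."""
    return hashlib.new(algorithm, domain.encode("utf-8")).hexdigest()


def is_member_domain(host, hashed_domains, algorithm="sha1"):
    """Return True if the host's root domain hashes to an entry in the set."""
    return hash_domain(root_domain(host), algorithm) in hashed_domains


# Stand-in for the downloaded list; in practice the set would be loaded from the file.
hashed_domains = {hash_domain("example.org")}
print(is_member_domain("www.example.org", hashed_domains))  # True
print(is_member_domain("example.com", hashed_domains))      # False
```

Keeping the hashes in a set makes each lookup a constant-time membership test, which is the main attraction of checking locally rather than calling the API for every domain.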

Check it out over at the documentation page.

October 10, 2011

DataCite supporting content negotiation

In April CrossRef launched content negotiation support for its DOIs. At the time I cheekily called out DataCite to start supporting content negotiation as well.

Edward Zukowski (DataCite's resident propeller-head) took up the challenge with gusto and, as of September 22nd, DataCite has also been supporting content negotiation for its DOIs. This means that one million more DOIs are now linked-data friendly. Congratulations to Ed and the rest of the team at DataCite.

We hope this is a trend. Back in June Knowledge Exchange organized a seminar on Persistent Object Identifiers. One of the outcomes of the meeting was the "Den Haag Manifesto", a document outlining five relatively simple steps that different persistent identifier systems could take in order to increase interoperability. Most of these steps involved adopting linked data principles, including support for content negotiation. We look forward to hearing about other persistent identifiers adopting these principles over the next year.

Having said that, this time I will refrain from calling out anybody specifically...


October 6, 2011

Family Names Service

Today I'm announcing a small web API that wraps a family name database here at CrossRef R&D. The database, built from CrossRef's metadata, lists all unique family names that appear as contributors to articles, books, datasets and so on that are known to CrossRef. As such the database likely accounts for the majority of family names represented in the scholarly record.

The web API comes with two services: a family name detector that will pick out potential family names from chunks of text and a family name autocompletion system.

Very brief documentation can be found here along with a jQuery example of autocompletion.

The database is still in development so there may be some oddities and inaccuracies in there. Right now one obvious omission from the name list that I hope to address soon is double-worded names such as "von Neumann". We're not proposing this database as an authority but rather as something that backs a practical service for family name detection and autocompletion.

April 19, 2011

Content Negotiation for CrossRef DOIs

So does anybody remember the posting DOIs and Linked Data: Some Concrete Proposals?

Well, we went with option "D."

From now on, DOIs, expressed as HTTP URIs, can be used with content-negotiation.

Let's get straight to the point. If you have curl installed, you can start playing with content-negotiation and CrossRef DOIs right away:

curl -D - -L -H "Accept: application/rdf+xml" "http://dx.doi.org/10.1126/science.1157784"

curl -D - -L -H "Accept: text/turtle" "http://dx.doi.org/10.1126/science.1157784"

curl -D - -L -H "Accept: application/atom+xml" "http://dx.doi.org/10.1126/science.1157784"

Or if you are already using CrossRef's "unixref" format:

curl -D - -L -H "Accept: application/unixref+xml" "http://dx.doi.org/10.1126/science.1157784"

This will work with over 46 million CrossRef DOIs as of today, but the beauty of the setup is that from now on, any DOI registration agency can enable content negotiation for their constituencies as well. DataCite, we're looking at you ;-)

It also means that, as registration agency members (CrossRef publishers, for instance) start providing more complete and richer representations of their content, we can simply redirect content-negotiated requests directly to them.

We expect that this development will round out CrossRef's efforts to support standard APIs including OpenURL and OAI-PMH, and we look forward to seeing DOIs increasingly used in linked data applications.

Finally, CrossRef would just like to thank the IDF and CNRI for their hard work on this as well as Tony Hammond and Leigh Dodds for their valuable advice and persistent goading.






