Crossref Labs is happy to announce the first public release of “pdf-extract”, an open source set of tools and libraries for extracting citation references (and, eventually, other semantic metadata) from PDFs. We first demonstrated this tool to Crossref members at our annual meeting last year. See the pdf-extract labs page for a detailed introduction to this new set of tools.
If you are unable to download and install the tool, you can play with an experimental web interface called “Extracto.” Be warned, Extracto is running on a very feeble server using an erratic and slow internet connection. The only guarantee that we can make about using it is that it will repeatedly fall over and annoy you. The weasel has spoken.
PHD Comics has posted its Valentine’s Day Reading list. Without DOIs! So in order to preserve the scholarly citation record, we’ve resolved those that have DOIs…. Title: The St. Valentine’s Day Frontal Passage Citation: Sassen, K, 1980, ‘The St. Valentine’s Day Frontal Passage’, Bulletin of the American Meteorological Society, vol. 61, no. 2, p. 122. Crossref DOI: http://dx.doi.org/10.1175/1520-0477(1980)0612.0.CO;2 Title: SUICIDE AND HOMICIDE ON ST. VALENTINE’S DAY Citation: LESTER, D, 1990, ‘SUICIDE AND HOMICIDE ON ST.
In April, Crossref started supporting content negotiation for its DOIs. At the time I cheekily called on DataCite to start supporting content negotiation as well.
Edward Zukowski (DataCite’s resident propeller-head) took up the challenge with gusto and, as of September 22nd, DataCite has also been supporting content negotiation for its DOIs. This means that one million more DOIs are now linked-data friendly. Congratulations to Ed and the rest of the team at DataCite.
We hope this is a trend.
So does anybody remember the posting DOIs and Linked Data: Some Concrete Proposals?
Well, we went with option “D.”
From now on, DOIs, expressed as HTTP URIs, can be used with content-negotiation.
Let’s get straight to the point. If you have curl installed, you can start playing with content-negotiation and Crossref DOIs right away:
curl -D - -L -H "Accept: application/rdf+xml" "http://dx.doi.org/10.1126/science.1157784"
curl -D - -L -H "Accept: text/turtle" "http://dx.
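For those who would rather script it than use curl, here is a minimal Python sketch of the same kind of request. It uses only the standard library; the function names (build_request, fetch_metadata) are made up for illustration and are not part of any Crossref tool. It assumes the resolver answers the Accept header the way the curl examples above describe.

```python
import urllib.parse
import urllib.request

# Resolver base URL taken from the curl examples above.
DOI_RESOLVER = "http://dx.doi.org/"


def build_request(doi, accept="application/rdf+xml"):
    """Build an HTTP request for a DOI that asks for a given media type.

    The Accept header is what drives content negotiation: it plays the
    role of curl's -H "Accept: ..." option.
    """
    url = DOI_RESOLVER + urllib.parse.quote(doi)
    return urllib.request.Request(url, headers={"Accept": accept})


def fetch_metadata(doi, accept="application/rdf+xml"):
    """Send the request and return the response body as text.

    urlopen follows redirects by default, much like curl's -L option.
    Assumes the response body is UTF-8 encoded.
    """
    with urllib.request.urlopen(build_request(doi, accept)) as response:
        return response.read().decode("utf-8")
```

Calling fetch_metadata("10.1126/science.1157784", "text/turtle") would ask for the Turtle representation instead of RDF/XML, assuming a working network connection and that the server offers that format for the DOI.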
While working on an internal project, we developed “pdfstamp”, a command-line tool that allows one to easily apply linked images to PDFs. We thought some in our community might find it useful and have released it on GitHub. Some more PDF-related tools will follow soon.
Since last month’s threads (here, here, here and here) talking about the issues involved in making the DOI a first-class identifier for linked data applications, I’ve had the chance to actually sit down with some of the thread’s participants (Tony Hammond, Leigh Dodds, Norman Paskin) and we’ve been able to sketch out some possible scenarios for migrating the DOI into a linked data world.
I think that several of us were struck by how little actually needs to be done in order to fully address virtually all of the concerns that the linked data community has expressed about DOIs.
Tony’s recent thread on making DOIs play nicely in a linked data world has raised an issue I’ve meant to discuss here for some time: a lot of the thread is predicated on the idea that Crossref DOIs are applied at the abstract “work” level. Indeed, that is what it currently says in our guidelines. Unfortunately, this is a case where theory, practice and documentation all diverge.
When the Crossref linking system was developed it was focused primarily on facilitating persistent linking amongst journals and conference proceedings.
Was outraged (outraged, I tell you) that one of my favorite online comics, PhD, didn’t include DOIs in their recent bibliography of Christmas-related citations. So I’ve compiled them below.
We care about these things so that you don’t have to. Bet you will sleep better at night knowing this.
Or perhaps not…
A Christmas Reading List… with DOIs. Citation: Biggs, R, Douglas, A, Macfarlane, R, Dacie, J, Pitney, W, Merskey, C & O’Brien, J, 1952, ‘Christmas Disease’, BMJ, vol.
In order to encourage publishers and other content producers to embed metadata into their PDFs, we have released an experimental tool called “pdfmark”. This open source tool allows you to add XMP metadata to a PDF. What’s really cool is that if you give the tool a Crossref DOI, it will look up the metadata in Crossref and then apply said metadata to the PDF. More detail can be found on the pdfmark page on the Crossref Labs site.
Inspired by Google’s recent promotion of QR Codes, I thought it might be fun to experiment with encoding a Crossref DOI and a bit of metadata into one of the critters. I’ve put a short write-up of the experiment on the Crossref Labs site, which includes a demonstration of how you can generate a QR Code for any given Crossref DOI. Put them on postcards and send them to your friends for the holidays.