Crossref Labs is happy to announce the first public release of “pdf-extract,” an open-source set of tools and libraries for extracting citation references (and, eventually, other semantic metadata) from PDFs. We first demonstrated this tool to Crossref members at our annual meeting last year. See the pdf-extract labs page for a detailed introduction to this new set of tools.
If you are unable to download and install the tool, you can play with an experimental web interface called “Extracto.” Be warned: Extracto is running on a very feeble server using an erratic and slow internet connection. The only guarantee that we can make about using it is that it will repeatedly fall over and annoy you. The weasel has spoken.