admin – 2010 August 03

In Identifiers, PDF, XMP, InChI

Just a quick heads-up to say that we’ve had a go at incorporating InChIs and ontology terms into our PDFs with XMP. There isn’t a lot of room in an XMP packet so we’ve had to be a bit particular about what we include.

  • InChIs: the bigger the molecule, the longer the InChI, so we’ve standardized on the fixed-length InChIKey. An InChIKey doesn’t mean anything on its own, so we’ve gone the Semantic Web route of including an InChI resolver HTTP URI. Alternatively, you can extract the InChIKeys with a regular expression.
  • Ontology terms: we’re using HTTP URIs again, pointing to either Open Biomedical Ontologies (OBO) URIs (biology, biomedicine; slashy) or RSC ontology terms (chemistry; hashy). Often the OBO URIs resolve to a specific web page, but for the moment the RSC URIs just point to a large OWL file. Slashy URIs are quite a bit more involved to set up, so we’ll have to see what the demand is like.
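
For anyone who wants to try the regular expression route, here is a minimal sketch in Python. It relies only on the standard InChIKey layout (14 uppercase letters, a hyphen, 10 uppercase letters, a hyphen, one final letter). The function name, the resolver host and the XMP fragment are made up for illustration and are not what is in our PDFs, and the sketch assumes you have already got the XMP packet out as plain text.

```python
import re

# Standard InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
# The lookarounds stop the pattern matching inside a longer run of capitals.
INCHIKEY_RE = re.compile(r"(?<![A-Z])[A-Z]{14}-[A-Z]{10}-[A-Z](?![A-Z])")


def extract_inchikeys(text):
    """Return the unique InChIKeys in text, in order of first appearance."""
    seen = []
    for key in INCHIKEY_RE.findall(text):
        if key not in seen:
            seen.append(key)
    return seen


# Hypothetical XMP fragment: the host name and property layout are
# assumptions for this example only.
sample = """
<rdf:li rdf:resource="http://resolver.example.org/inchikey/RYYVLZVUVIJVGH-UHFFFAOYSA-N"/>
<rdf:li rdf:resource="http://resolver.example.org/inchikey/LFQSCWFLJHTTHZ-UHFFFAOYSA-N"/>
<rdf:li rdf:resource="http://resolver.example.org/inchikey/RYYVLZVUVIJVGH-UHFFFAOYSA-N"/>
"""

if __name__ == "__main__":
    for key in extract_inchikeys(sample):
        print(key)
```

The pattern only checks the shape of the key, so anything that happens to look like 14-10-1 capitals will match; it doesn’t tell you whether the key corresponds to a real structure.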

There’s only about 4K to play with, so it’s only ever going to be a best-of. More detailed article metadata has to go in either a sidecar file, as Tony has pointed out before, or ideally on the article landing page. The example files are here and I’ve posted something with a different slant on the RSC technical blog.

