
January 19, 2009

CURIE Syntax 1.0

The W3C has recently (Jan. 16) released CURIE Syntax 1.0 as a Candidate Recommendation and is inviting implementations.

(Note that I made a fuller post here on CURIEs and erroneously described the Editor's Draft (Oct. 23, '08) as a Candidate Recommendation. Well, at least it's got there now.)

January 17, 2009

Standard InChI Defined

IUPAC has just released the final version (1.02) of its InChI software, which generates Standard InChIs and Standard InChIKeys. (InChI is the IUPAC International Chemical Identifier.)

The Standard InChI "removes options for properties such as tautomerism and stereoconfiguration", so that a molecule will always generate the same stable identifier - a unique InChI - which facilitates "interoperability/compatibility between large databases/web searching and information exchange". Note also that any "shortcomings in Standard InChI may be addressed using non-Standard InChI (currently obtainable using InChI version 1.02beta)".

On a practical level this means that the 27-character InChIKeys (a hashed form of the InChI), with the following generic form

AAAAAAAAAAAAAA-BBBBBBBBFV-P
can now be readily and reliably generated and will start to be used in search indexing and linking applications.
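
As a rough illustration of that generic form, here is a minimal Python sketch that checks a string against the 14-10-1 letter layout and splits it into its parts. The function name split_inchikey and the field names are hypothetical, and my reading of the F, V and P positions as flag, version and protonation characters is an assumption rather than something stated above. The sketch only checks the shape of a key; it does not compute or verify the hash.

```python
import re

# Layout taken from the generic form in the post:
# 14 letters, hyphen, 8 letters + 1 flag + 1 version letter, hyphen, 1 letter.
INCHIKEY_LAYOUT = re.compile(r"([A-Z]{14})-([A-Z]{8})([A-Z])([A-Z])-([A-Z])")


def split_inchikey(key):
    """Return the parts of a key matching the layout, or None if it does not match."""
    match = INCHIKEY_LAYOUT.fullmatch(key)
    if match is None:
        return None
    first, second, flag, version, last = match.groups()
    return {
        "first_block": first,    # the 14 A's in the generic form
        "second_block": second,  # the 8 B's in the generic form
        "flag": flag,            # the F position (assumed meaning: flag)
        "version": version,      # the V position (assumed meaning: version)
        "last": last,            # the P position (assumed meaning: protonation)
    }


generic = "AAAAAAAAAAAAAA-BBBBBBBBFV-P"
print(len(generic))
print(split_inchikey(generic))
print(split_inchikey("not-an-inchikey"))
```

A check like this is only a cheap sanity filter before a key goes into an index or a link; two different strings of the right shape are of course still different identifiers.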

January 16, 2009

XMP Library for Flash

Update about new XMP Library from Adobe Labs:

"The new Adobe XMP Library for ActionScript is now available for download on Adobe Labs. Adobe Extensible Metadata Platform (XMP) is a labeling technology that allows you to embed data about a file, known as metadata, into the file itself. XMP is an open technology based on RDF and RDF/XML. With this new library you can read existing XMP metadata from Flash based file formats via the Adobe Flash Player."
Any volunteers?

January 06, 2009

Poorboy Metadata Hack

I was playing around recently and ran across this little metadata hack. At first, I thought somebody was doing something new. But no, nothing so forward apparently. (Heh! :)

I was attempting to grab the response headers from an HTTP request on an article page and was using the Perl LWP library by default. For some reason I was getting metadata elements being spewed out as response headers - at least from some of the sites I tested. With some further investigation I tracked this back to LWP itself, which parses the HTML <head> section of the response and generates HTTP pseudo-headers using an X-Meta- style header. (This can be viewed either as a feature of LWP or as a bug, as this article bemoans.)

What this means anyway is that I can issue a simple call like this to get the HTML metadata - shown here for doi:10.1087/095315108X288947:

% lwp-request -ed 'http://dx.doi.org/10.1087/095315108X288947' | grep -i x-meta
X-Meta-DC.Creator: Rapple, Charlie
X-Meta-DC.Identifier: info:doi/10.1087/095315108X288947
X-Meta-DC.Publisher: Association of Learned and Professional Society Publishers
X-Meta-DC.Title: Knowledge bases: improving the information supply chain
X-Meta-DC.Type: Text
X-Meta-DCTERMS.BibliographicCitation: Learned Publishing, 21, 2, 110-115(6)
X-Meta-DCTERMS.IsPartOf: urn:ISSN:0953-1513
X-Meta-DCTERMS.Issued: April 2008
X-Meta-IC.Identifier: alpsp/lp/2008/00000021/00000002/art00005

This shows a simple (read: lazy) means of accessing metadata added as <meta> tags in HTML headers, such as those we added for Nature. (Of course, machine-readable metadata is best added using RDFa as noted earlier, but that does not preclude also adding <meta> tags, which are usable with HTML as well as XHTML.)
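
For anyone without LWP to hand, the same trick can be approximated in a few lines of Python. This is only a sketch of the idea (turn each named <meta> tag found in the <head> into an X-Meta- pseudo-header), not a description of how LWP does it internally. The class and function names are hypothetical, and the sample page is made up for illustration, reusing two values from the output above.

```python
from html.parser import HTMLParser


class MetaHeaderParser(HTMLParser):
    """Collect <meta name=... content=...> tags inside <head> as pseudo-headers."""

    def __init__(self):
        super().__init__()
        self.headers = []
        self.in_head = False

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attributes = dict(attrs)
            name = attributes.get("name")
            content = attributes.get("content")
            if name and content is not None:
                self.headers.append(("X-Meta-" + name, content))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def meta_pseudo_headers(html):
    """Return a list of (header, value) pairs for the <meta> tags in the head."""
    parser = MetaHeaderParser()
    parser.feed(html)
    parser.close()
    return parser.headers


# Made-up sample page; a real one would be fetched over HTTP first.
sample = """<html><head>
<title>Example</title>
<meta name="DC.Creator" content="Rapple, Charlie">
<meta name="DC.Type" content="Text" />
</head><body><meta name="ignored" content="x"></body></html>"""

for header, value in meta_pseudo_headers(sample):
    print(header + ": " + value)
```

Restricting the scan to the <head> keeps stray <meta> tags in the body out of the result, which seems closest in spirit to the header-style output shown above.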

(Btw, wouldn't it be fun if CrossRef had a random DOI facility? That would be real handy for testing as well as giving users a feel for what real-life DOIs look like and what lies at the other end of them.)