Poorboy Metadata Hack

thammond – 2009 January 06

In Metadata

I was playing around recently and ran across this little metadata hack. At first, I thought somebody was doing something new. But no, nothing so forward apparently. (Heh! 🙂)

I was attempting to grab the response headers from an HTTP request on an article page and was using the Perl LWP library by default. For some reason, metadata elements were being spewed out as response headers, at least for some of the sites I tested. With some further investigation I tracked this back to LWP itself, which parses the HTML <head> section and generates HTTP pseudo-headers from the <meta> elements it finds, using an X-Meta- style header name. (This can be viewed either as a feature of LWP or as a bug, as this article bemoans.)

What this means anyway is that I can issue a simple call like this to get the HTML metadata - shown here for doi:10.1087/095315108X288947:
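For anyone without Perl to hand, here is a minimal Python sketch of the same idea. It is only an imitation of the behavior described above, not the LWP call itself. The class and function names are hypothetical, the sample HTML and its DOI are made up, and the header naming rule (prefix X-Meta-, capitalize the first letter, turn colons into hyphens) is an assumption about what LWP produces rather than a guarantee.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class MetaHeaderParser(HTMLParser):
    """Collect <meta name=... content=...> tags found before the body
    as X-Meta-* pseudo-headers (an imitation, not LWP's own code)."""

    def __init__(self):
        super().__init__()
        self.pseudo_headers = []  # list of (header name, value) pairs
        self._done = False        # set once the <head> section is over

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self._done = True
        if tag != "meta" or self._done:
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if not name or content is None:
            return  # this sketch ignores http-equiv and charset metas
        # Assumed naming rule: X-Meta- prefix, first letter upper-cased,
        # colons replaced by hyphens.
        header = "X-Meta-" + (name[:1].upper() + name[1:]).replace(":", "-")
        self.pseudo_headers.append((header, content))

    def handle_endtag(self, tag):
        if tag == "head":
            self._done = True


def pseudo_headers_from_html(html_text):
    """Return the pseudo-headers for an HTML document given as a string."""
    parser = MetaHeaderParser()
    parser.feed(html_text)
    parser.close()
    return parser.pseudo_headers


def fetch_pseudo_headers(url):
    """Fetch a page over HTTP and return its pseudo-headers (needs network)."""
    request = Request(url, headers={"User-Agent": "meta-sketch/0.1"})
    with urlopen(request) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html_text = response.read().decode(charset, errors="replace")
    return pseudo_headers_from_html(html_text)


if __name__ == "__main__":
    # Made-up sample page; the DOI below is not a real one.
    sample = """<html><head>
    <title>Example article</title>
    <meta name="dc.title" content="An Example Title">
    <meta name="citation_doi" content="10.9999/example">
    </head><body><p>Article text</p></body></html>"""
    for header, value in pseudo_headers_from_html(sample):
        print(f"{header}: {value}")
```

To try it against a live page, pass an article URL to `fetch_pseudo_headers`. Which <meta> tags come back depends entirely on what the publisher has put in the page head.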


This shows a simple (read lazy) means of accessing metadata added as <meta> tags in the HTML head, such as those we added for Nature. (Of course, machine-readable metadata is best added using RDFa as noted earlier, but that does not preclude also adding <meta> tags, which are usable with HTML as well as XHTML.)

(Btw, wouldn’t it be fun if Crossref had a random DOI facility? That would be real handy for testing as well as giving users a feel for what real-life DOIs look like and what lies at the other end of them.)


Last Updated: 2009 January 6 by thammond