Publishing Linked Data
With these words:
_“There was quite some interest in Linked Data at this year’s World Wide
Web Conference (WWW2007). Therefore, Richard Cyganiak, Tom Heath and I
decided to write a tutorial about how to publish Linked Data on the
Web, so that interested people can find all relevant information, best
practices and references in a single place.”_
Chris Bizer announces this draft, _How to Publish Linked Data on the Web_. It’s a bright and breezy tutorial, and useful (to me, anyway) for disclosing a couple of links:
- Berners-Lee’s piece on Linked Data
- Findings of the W3C TAG
The tutorial is unsurprisingly orthodox in its advocacy for all things HTTP and goes on to say:
“In the context of Linked Data, we restrict ourselves to using HTTP URIs only and avoid other URI schemes such as URNs and DOIs.”
But this only relates back to Berners-Lee’s piece on Linked Data referenced above in which he says:
“The second rule, to use HTTP URIs, is also widely understood. The only deviation has been, since the web started, a constant tendency for people to invent new URI schemes (and sub-schemes within the urn: scheme) such as LSIDs and handles and XRIs and DOIs and so on, for various reasons. Typically, these involve not wanting to commit to the established Domain Name System (DNS) for delegation of authority but to construct something under separate control. Sometimes it has to do with not understanding that HTTP URIs are names (not addresses) and that HTTP name lookup is a complex, powerful and evolving set of standards. This issue discussed at length elsewhere, and time does not allow us to delve into it here.”
Hmm. Does make one wonder where the concept of URI ever arose. Surely the nascent WWW application should have mandated the exclusive use of HTTP identifiers? Seems that this concept snuck up on us somehow and we now have to put it back into the box. Pandora, indeed!
Back to the tutorial: there are some unorthodox terms, or at least terms I had not heard of before. Contrasted with the defined term “information resources” (from AWWW) is the undefined term “non-information resources”. Further on, a distinction is made between two types of RDF triple: “literal triples” and “RDF links”. Both are presented as if they were in common usage. The tutorial then goes on to deprecate the use of certain RDF features because doing without them makes things “easier for clients”. So I guess that either the full expressivity of RDF is not required, or the world of “linked data” is not quite so large as it would like to be.
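For what it’s worth, here is how I read the “literal triples” versus “RDF links” distinction, as a minimal Python sketch. The `URIRef` and `Literal` classes, the `classify` helper and the `example.org` URIs are all made up for illustration; they are not taken from the tutorial or from any RDF library. My assumption is that the only thing separating the two kinds of triple is whether the object is a literal value or another URI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URIRef:
    """Hypothetical stand-in for a URI reference naming a resource."""
    value: str


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal (string, number, date)."""
    value: str


def classify(triple):
    """Label a (subject, predicate, object) triple by the kind of its object.

    Assumption: a "literal triple" has a literal as its object, while an
    "RDF link" has a URI reference there, i.e. it points at another resource.
    """
    _subject, _predicate, obj = triple
    return "literal triple" if isinstance(obj, Literal) else "RDF link"


# Illustrative data only; the example.org URIs are placeholders.
alice = URIRef("http://example.org/people/alice")
bob = URIRef("http://example.org/people/bob")

triples = [
    (alice, URIRef("http://xmlns.com/foaf/0.1/name"), Literal("Alice")),
    (alice, URIRef("http://xmlns.com/foaf/0.1/knows"), bob),
]

kinds = [classify(t) for t in triples]
for (_s, predicate, _o), kind in zip(triples, kinds):
    print(predicate.value, "->", kind)
```

On that reading the two terms are just names for the two shapes a triple’s object can take, which may be why they are presented without much ceremony.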
And later on, there’s this puzzling injunction:
“You should only define terms that are not already defined within well-known vocabularies. In particular this means not defining completely new vocabularies from scratch, but instead extending existing vocabularies to represent your data as required.”
Am I wrong, or is there something of a Catch-22 there? To extend an arbitrary vocabulary I would need to be the namespace authority, the “URI owner” in W3C speak. But I can’t be the authority for every namespace or vocabulary; indeed, by the intent of the above there would likely be just the one (true?) vocabulary, which I may or may not be the authority for. I thought the intent of the RDF model and XML namespaces was that terms from disparate vocabularies could be applied to the description at hand.
Anyways, I am not trying to knock the draft. It’s something of a curate’s egg, that’s true, but I am genuinely looking forward to reading it through and would encourage others to have a look at it too.