Five Years

Oh wow! A rather remarkable plea here from Dan Brickley on the public-lod mailing list, which calls for the registrant of the dbpedia.org DNS entry to top it up with another 5+ years' worth of clock time. Some quotes:

"The idea of such a cool RDF namespace having only 6 months left on the DNS registration gives me the worries."

"If you could add another 5-10 years to the DNS registration I'd sleep easier at night."

"Let me stress I'm not suggesting that this domain is actually at risk. Just that the not-at-risk-ness isn't readily evident from a quick look in the DNS."

"Those in the know are probably confident this is all in hand, but as the SW gets bigger I suspect we ought to establish practices such as "vocabularies that seek global adoption should always have 5+ years on their DNS registries"."

Yes, and maybe those cool URIs should have kite marks, too. ;)

(Btw, for those who may not already know, the maximum length of time for which any DNS name may be leased out in a single registration is 10 years; see the FAQ put out by ICANN.)
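To make Dan's suggested practice concrete, here is a minimal sketch of the arithmetic, assuming you already have the domain's expiry date to hand (say, from a WHOIS lookup, which is not shown). The function names and the dates in the usage note are hypothetical, and the sketch assumes the 10-year limit applies to the total remaining term after a renewal.

```python
from datetime import date

MAX_TERM_YEARS = 10  # registration cap mentioned above
MIN_YEARS_LEFT = 5   # threshold from Dan's suggested practice


def years_remaining(expires: date, today: date) -> float:
    """Approximate years left on a registration (365.25-day years)."""
    return (expires - today).days / 365.25


def needs_renewal(expires: date, today: date,
                  min_years: float = MIN_YEARS_LEFT) -> bool:
    """True if fewer than min_years remain on the registration."""
    return years_remaining(expires, today) < min_years


def max_renewal_years(expires: date, today: date,
                      cap: int = MAX_TERM_YEARS) -> int:
    """Whole years that could be added without exceeding the cap.

    Assumption: the cap limits the total remaining term, not just
    the length of a single renewal.
    """
    return max(0, int(cap - years_remaining(expires, today)))
```

For a hypothetical namespace with about six months left, `needs_renewal(date(2009, 7, 1), date(2009, 1, 1))` is true, and `max_renewal_years` for the same dates says that up to 9 whole years could still be added under the cap.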

So, pity the poor user of a given semantic web application who may not know what life expectancy lies behind the nodes in an RDF graph of assertions. Shifting sands, indeed.

Comments

The shifting sands aren't a bug, they're a feature :-)

@Ed: So it's to be regarded as a feature when the name authority for a resource disappears? Oh my. ;)

Seriously though, Dan's post does highlight the notion of valued resources - in his case, groovy vocab terms. We also have that same issue very close to heart with the DOI, which is regularly assigned to scholarly articles that are deemed to have an intrinsic value and thus are in need of some protection.

My reason for commenting on Dan's post was that here we have a prominent commentator on semantic web development who recognizes a key vulnerability - could one say it is the Achilles' heel? - of URIs as identifiers: the name authority component.

Tony, you should know better than to conflate "URIs as identifiers" with "URIs based on the DNS" with "http[s]://" URIs. There are plenty of URIs that don't use DNS. There are even non-HTTP DNS-based identifiers like tag: that make interesting use of time-indexed DNS state. Your issue here is really with http/https, I think.

Any deployable name authority system is vulnerable to mis-management by its users. Perhaps someone assigns IDs based on the lunch-break experiments, without reading their contract of employment. Perhaps they forget the passwords and get hit by a bus. Perhaps an org is folded into another, and its metadata projects shut down clumsily by a lazy middle manager.

There are lots of ways to screw up. The core technology choices we make set things up so that different classes of screwup are easier and harder. Using http and https URIs for metadata vocabulary naming makes many things easier, but also creates risks that we need to be aware of.

When app authors choose metadata vocabs, there are many things to take into account. The vocab naming policy and mechanisms for long term availability are some of those. I'm working on some tools to help with this...

If others prefer to use DOIs, that's fine too. But even DOIs can perfectly well be shipped around as URIs, so to point to this as a flaw in the URI system is a little misleading.

Dan, of course, as you point out, I am talking about DNS-based HTTP/S URIs. Other schemes and IP-based URIs are largely confined to fringe cases. AWWW, 2.2.2.1 URI ownership, has this to say about onward delegation of authority:

"The approach taken for the "http" URI scheme, for example, follows the pattern whereby the Internet community delegates authority, via the IANA URI scheme registry and the DNS, over a set of URIs with a common prefix to one particular owner. One consequence of this approach is the Web's heavy reliance on the central DNS registry."

And the therein referenced TAG issue siteData-36 goes on to say this:

"The architecture of the web is that the space of identifiers on an http web site is owned by the owner of the domain name."

Clearly with the Web at large we are talking plain old URLs with the "owner's" name lit up front in the auth component (whether there's anything to be got from that address or not).

What I can't remember having seen in AWWW or other docs (e.g. RDF specs, SKOS or Linked Data docs) is any mention of this vulnerability to persistence that DNS-based URIs exhibit. (I may be wrong. It may be dealt with somewhere.) You yourself anyway drew attention to the "at-risk-ness" of the DNS namespace. My point was not that HTTP URIs are bad or that DOI is in any way better, but rather that there is a fairly general belief that URLs are pretty stable things if managed carefully, and that once somebody procures (er, leases) a DNS auth name they can churn out semantic terms till the cows come home. Very clearly this is not the case, as you advert to in your post. DNS-based namespace authorities are necessarily transient and thus need to be maintained if the URIs built on top of them are to have a wide utility in space and time. Strangely, this does not seem to be articulated anywhere. I found your post to be valuable in drawing attention to this aspect (durability) of naming semantic terms and would only like to see a more formal recognition of this very real problem.

Hey Tony, I tried to comment more usefully over here. I wasn't sure if the track-back would work.
