Describing Resource Sets: ORE vs POWDER

I’ve been reading up on POWDER recently (the W3C Protocol for Web Description Resources) which is currently in last call status (with comments due in tomorrow). This is an effort to describe groups of Web resources and as such has clear similarities to the Open Archives Initiative ORE data model, which has been blogged about here before.
In an attempt to better understand the similarities (and differences) between the two data models, I’ve put up the table

A Comparison of Description Mechanisms for URI Collections

ore-powder-fragment-30.jpg

which directly compares the two heavyweight contenders OAI-ORE and POWDER and also (unfairly) places them alongside the featherweight Sitemaps Protocol for reference.
This is very much a draft document and I will aim to update the table based on my own further reading and on any feedback that I may get (contributions gratefully received). I’m all too aware that my understanding of the respective data models is painfully limited and I, for one, hope to profit through this exercise. There will certainly be errors, which I will aim to fix as soon as I get wind of them. :)
By the way, the ORE work especially is of interest to CrossRef members and has obvious synergies with the multiple resolution potential that DOI has long promised but not quite delivered on.

Creative Commons License
Describing Resource Sets: ORE vs POWDER by Tony Hammond, unless otherwise expressly stated, is licensed under a Creative Commons Attribution 4.0 International License.

8 thoughts on “Describing Resource Sets: ORE vs POWDER”

  1. Ed Summers

    Incredibly useful–thanks Tony. I’ve had it in my head that oai-ore and powder were similar, but seeing sitemaps thrown into the mix is helpful as a ground.
    You indicate that oai-ore’s use of an HTTP URI to identify aggregations is a ‘con’. I’d be curious to hear why.
    Also, I think a better explanation of the cons for sitemaps is that the aggregation being described is an entire website…not some subset thereof. I guess this is implied in what you have currently … but in my opinion it’s not stated forcefully enough :-)
    Also, just out of curiosity have you seen DERI’s sitemap extension http://sw.deri.org/2007/07/sitemapextension/ ?

  2. Tony Hammond

    Thanks for that Ed. Glad to know it may have been of some use.
    The HTTP URI thing is just my pet peeve. I don’t see why ORE should dictate how I am to name my aggregations. The reason for an HTTP-based, or rather the more general “protocol-based”, URI for aggregations is discovery: it lets a client infer the URI of the Resource Map. That’s useful, certainly, but I just don’t happen to like it. Me being bolshy, I guess.
    Take your point about sitemaps. And yes, I just checked that sitemap indexes are also subject to the same domain constraint. (“A Sitemap index file can only specify Sitemaps that are found on the same site as the Sitemap index file.”)
    Thanks for the DERI sitemap extension doc. Have printed, will read.

  3. Ed Summers

    FWIW, I think I’ve seen some ore resource maps from Mark Diggory that use handle info-uris. http://groups.google.com/group/oai-ore/browse_thread/thread/ebf53186ae09b8d4
    I actually didn’t know that you were *required* to use HTTP URIs.
    “””
    A URI-A MUST be a protocol-based URI. However, an Aggregation is a conceptual construct, and thus it does not have a Representation. In contrast, a Resource Map that asserts the Aggregation does have a Representation in which that assertion is made available to clients and agents. The Cool URIs for the Semantic Web guidelines are adopted to support discovery of the HTTP URI of the asserting Resource Map given the HTTP URI of an Aggregation. Details about the mechanisms of access are described in ORE User Guide – HTTP Implementation.
    “””
    Does ‘protocol based URI’ mean only HTTP URIs then?

  5. Tony Hammond

    Hi Ed:
    Ah, that’s exactly my point. A data publisher should be able to use their own name for an aggregation as in the resource map examples you cited which use a handle.
    I was a little slapdash in the table referring to HTTP URIs (which by the way the ORE model largely presumes). In fact, as you point out, ORE requires “protocol-based” URIs, which are intended to be URIs built upon a network protocol, in other words plain old-fashioned URLs. The intent (as I understand it) is to use client-supported mechanisms to discover resource map URIs from aggregation URIs, e.g. using HTTP 303 redirects or hash URIs. The latter especially is a horrid kludge which builds on the fact that a fragment identifier is retained for local processing and not submitted to the network in a retrieval op, i.e. a browser would strip the fragment before attempting to resolve the hash URI. Therefore the reasoning goes that the hash URI could be used for the aggregation and the network-clean version for the GET-able resource map. (The wording about semantics of fragment identifiers in RFC 3986 is wonderfully obscure and would support more or less any interpretation.)
    Myself, I would have preferred a cleaner separation between the two URIs and allowed data publishers freedom in naming. (It should be noted that POWDER doesn’t attempt to lay similar restrictions on data providers.)
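
The hash-URI mechanism described in this comment can be illustrated with a minimal Python sketch. The URI below is hypothetical (it is not taken from the ORE documents), and the sketch only shows the fragment-stripping step a client performs, not a full ORE discovery implementation.

```python
from urllib.parse import urldefrag

# Hypothetical aggregation URI following the hash-URI pattern.
aggregation_uri = "http://example.org/rem/article-1#aggregation"

# The fragment is kept on the client side; only the fragment-free URI
# would be sent over the network in a retrieval operation.
resource_map_uri, fragment = urldefrag(aggregation_uri)

print(resource_map_uri)  # the GET-able URI, assumed here to be the resource map
print(fragment)          # the part retained for local processing
```

Under this pattern the full URI names the aggregation, while the stripped URI is the one a client can actually fetch.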

  6. Ed Summers

    I wasn’t part of the OAI-ORE discussion, but I don’t see why handles couldn’t be deemed protocol-based. There is after all a means for de-referencing them, right? If there is a specific section of the ORE documentation that says otherwise I’d appreciate a pointer.

  7. Tony Hammond

    Well, Ed, I think the intent was for URI schemes that mapped 1:1 to network protocols (e.g. http:, ftp:, ldap:, …) which may find support in user agents. Handle would fit right in there, if it defined its own URI scheme, which so far it has chosen not to do.
    The reference you make is to the “info:” scheme. See this passage from the RFC:

    “The “info” URI scheme exists primarily for identification purposes. Implementations MUST NOT assume that an “info” URI can be dereferenced to a representation of the resource identified by the URI although Namespace Authorities MAY disclose in the registration record references to service mechanisms pertaining to identifiers from the registered namespace.”

    So, the “info:” URI scheme is not tied 1:1 to a network protocol. It should be noted though that the “hdl” namespace does report a service mechanism (i.e. “http://hdl.handle.net/”) in its registration record.
    I think I am guilty of coining the phrase “protocol-based” URIs in an attempt to widen the scope beyond HTTP URIs. That should have been defined more clearly in the specs. I don’t see any ready reference to that definition. You would have to ask the authors of the specs, though.
    Also, as noted earlier:

    • I would not have restricted aggregation URIs to be protocol-based,
    • and it would be very helpful for handle users if there were a top-level URI scheme defined for handle.
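
The service mechanism mentioned for the “hdl” namespace suggests one way a client might turn a handle “info:” URI into something fetchable. The Python sketch below is an illustration only: the function name and the sample handle are hypothetical, and the mapping is an assumption based on the proxy address quoted in this comment. The RFC passage above still applies, so a client must not assume that every “info:” URI can be dereferenced this way.

```python
HANDLE_PROXY = "http://hdl.handle.net/"  # service mechanism quoted in the comment
INFO_HDL_PREFIX = "info:hdl/"


def handle_info_uri_to_http(uri: str) -> str:
    """Map an info:hdl/ URI to a proxy URL (hypothetical helper)."""
    # The scheme and namespace are compared case-insensitively.
    if uri[:len(INFO_HDL_PREFIX)].lower() != INFO_HDL_PREFIX:
        raise ValueError(f"not an info:hdl URI: {uri!r}")
    handle = uri[len(INFO_HDL_PREFIX):]
    if not handle:
        raise ValueError("missing handle after the namespace")
    return HANDLE_PROXY + handle


# "12345/example" is a made-up handle used purely for illustration.
print(handle_info_uri_to_http("info:hdl/12345/example"))
```

The point of the sketch is that the translation lives in the client, not in the URI scheme itself, which is why “info:” is not counted as protocol-based.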

  8. Rob Sanderson

    Protocol-based means, essentially, that the URI has a transport protocol as well as an identification scheme. E.g. info: URIs do not have a transport protocol, whereas http, https, and ftp do.
    And indeed, it is to ensure that it’s not just http URIs, but URIs that can be dereferenced to a resource. If there’s a client that you can type the URI into and it uses the protocol to fetch the resource being identified, then it’s good.
    As Tony says, the reason for this is discovery. Linking to resources has proven to be MUCH more successful than the closed garden approach of registries, registries and more registries.
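
The distinction drawn in this thread can be sketched as a simple scheme check in Python. The set of schemes below is an assumption assembled from the examples named in the comments (http, https, ftp, ldap); it is not a list defined by the ORE specifications, and the function name is hypothetical.

```python
from urllib.parse import urlsplit

# Assumed, non-exhaustive set of schemes that map onto a transport protocol.
PROTOCOL_SCHEMES = {"http", "https", "ftp", "ldap"}


def is_protocol_based(uri: str) -> bool:
    """Return True if the URI scheme is in the assumed protocol-based set."""
    return urlsplit(uri).scheme.lower() in PROTOCOL_SCHEMES


# Both URIs below are made up for illustration.
print(is_protocol_based("http://example.org/aggregation/1"))
print(is_protocol_based("info:hdl/12345/example"))
```

A real client would more likely ask whether it has a resolver for the scheme than consult a fixed list, which matches the “type the URI into a client” test described above.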

Comments are closed.