
Exposing Public Data


thammond – 2008 May 31

In Discussion

As the range of public services (e.g. RSS) offered by publishers has matured, the question arises: how can they expose their public data so that a user may discover it? In particular, with DOI there is now a persistent link infrastructure in place for accessing primary content. How can publishers leverage that infrastructure to advantage?

Anyway, I offer this figure to show how I see the current lie of the land as regards DOI services and data.

Legend - Current DOI service architecture showing data repositories, service access points, and open/closed data domains.

The figure above shows the three data repositories and service access points in the current DOI services architecture. At right and bottom of the figure are the two types of service (public services and private services) that together are instrumental in getting a user from a DOI-based link (on a third-party site) to the correct page of content (from the primary content provider). (Note that a fourth, private data repository – the institutional repository – comes into play when OpenURL user context-sensitive linking is added.)

At left of the figure are services operated by Crossref on its own metadata database, which support a) publisher lookups of DOIs, and b) third-party metadata services (DOI-to-metadata and metadata-to-DOI conversions). These might best be labelled protected services since they are not freely available: the first is open to members at a cost, while the second is free but to associated organizations only – members, affiliates, etc.

The term open data is used here in the sense implied by the current W3C SWEO LOD (Linking Open Data) Project. Open data is public data unencumbered by any access restrictions. By contrast, closed data is data that has some access restrictions placed on it – even data that is open to affiliates. (This is not an issue that LOD addresses directly, although it is implied that data is globally ‘open’, i.e. public.)

The current DOI service architecture thus breaks down as:

  • Native DOI services – resolving the DOI token
    • Public – DOI Proxy Server (‘’)
  • Related DOI services – using the DOI token
    • Protected – Crossref
    • Private – Publisher
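
The breakdown above, together with the open/closed distinction drawn earlier, can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, the service labels ("Metadata lookup", "Content delivery") and the `is_open` rule are hypothetical and are not part of any DOI or Crossref interface.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    PUBLIC = "public"        # no access restrictions
    PROTECTED = "protected"  # available to members, affiliates, etc. only
    PRIVATE = "private"      # internal to the operator


@dataclass(frozen=True)
class Service:
    name: str      # illustrative label, not a real endpoint
    operator: str
    access: Access
    native: bool   # True: resolves the DOI token; False: merely uses it

    @property
    def is_open(self) -> bool:
        # Open data in the sense used here: public, with no access restrictions.
        # Data that is open to affiliates only still counts as closed.
        return self.access is Access.PUBLIC


# Hypothetical entries mirroring the three tiers in the breakdown.
SERVICES = [
    Service("DOI proxy resolution", "DOI Proxy Server", Access.PUBLIC, native=True),
    Service("Metadata lookup", "Crossref", Access.PROTECTED, native=False),
    Service("Content delivery", "Publisher", Access.PRIVATE, native=False),
]

open_services = [s.name for s in SERVICES if s.is_open]
closed_services = [s.name for s in SERVICES if not s.is_open]
```

On this reading only the proxy tier falls in the open data domain; both the protected and the private tiers are closed.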

Note that a DOI is ‘resolved’ into state data registered with it, or, as ISO CD 26324 puts it: “Resolution is the process of submitting a specific DOI name to the DOI system and receiving in return the associated values held in the DOI resolution record for one or more types of data relating to the object identified by that DOI name.”
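
A minimal sketch of resolution in that sense, assuming an in-memory table in place of the real DOI system: a DOI name goes in, and the values held in its resolution record come back, optionally restricted to one or more data types. The DOI name `10.1234/example`, the record contents, the `proxy.example` host and both function names are made up for illustration.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class Value:
    type: str  # kind of data held in the record, e.g. "URL"
    data: str  # the value registered for that type


# Hypothetical stand-in for the DOI system: DOI name -> resolution record.
RECORDS = {
    "10.1234/example": [
        Value("URL", "https://publisher.example/article/1"),
        Value("EMAIL", "support@publisher.example"),
    ],
}


def resolve(doi, types=None):
    """Return the values registered with `doi`, optionally filtered by type."""
    record = RECORDS[doi]  # raises KeyError for an unregistered DOI name
    if types is None:
        return list(record)
    return [v for v in record if v.type in types]


def proxy_link(doi, proxy_base):
    """Build a DOI-based link; the caller supplies the proxy server base URL."""
    # Percent-encode characters that are unsafe in a URL path, keeping "/" literal.
    return proxy_base.rstrip("/") + "/" + quote(doi, safe="/")


urls = [v.data for v in resolve("10.1234/example", types={"URL"})]
link = proxy_link("10.1234/example", "https://proxy.example/")
```

Here `urls` holds only the URL-typed value from the record, which is the piece a proxy needs in order to send a user on to the publisher's page.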

So, how might publishers best leverage this DOI service architecture to expose their public data?


Page owner: thammond   |   Last updated 2008-May-31