
Oh No, Not You Again!

thammond – 2007 October 02

Oh dear. Yesterday’s post “Using ISO URNs” was way off the mark. I don’t know. I thought that walk after lunch had cleared my mind. But apparently not. I guess I was fixated on eyeballing the result in RDF/N3 rather than on the logic needed to arrive at that result.


There are three namespace cases (and I was only wrong in two out of the three, I think):

1. “pdf:”

I was originally going to suggest the use of “data:” for the PDF information dictionary terms here but then lunged at using an HTTP URI (the URI of the page for the PDF Reference manual on the Adobe site) for regular orthodox conformancy and good churchgoing:

@prefix pdf: <http://www.adobe.com/devnet/pdf/pdf_reference.html> .


This was wrong on two counts:

a) Afaik no such use for this URI as a namespace has ever been made by Adobe. And it is in the gift of the DNS tenant (elsewhere called “owner”) to mint URIs under that namespace and to ascribe meanings to those URIs.

b) Also the URI is not best suited to a role as namespace URI since RDF namespaces typically end in “/” or “#” to make the division between namespace and term clearer. (In XML it doesn’t make a blind bit of difference as XML namespaces are just a scoping mechanism.) So to have a property URI as

http://www.adobe.com/devnet/pdf/pdf_reference.htmlAuthor


does the job but looks pretty rough and more importantly precludes (at least, complicates) the possibility of dereferencing the URI to return a page with human or machine readable semantics. Better in RDF terms is one of the following:

a) http://www.adobe.com/devnet/pdf/pdf_reference/Author

b) http://www.adobe.com/devnet/pdf/pdf_reference#Author
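To make the point concrete, here is a minimal Python sketch of how a prefixed name gets expanded, which is nothing more than string concatenation. The helper name `expand` is made up for illustration, and the “/” and “#” namespace URIs are hypothetical: as noted above, Adobe has published no such namespace.

```python
def expand(namespace: str, term: str) -> str:
    """Expand a prefixed name the way N3 does: plain concatenation."""
    return namespace + term


# The namespace from yesterday's post: no delimiter, so the term is glued on.
glued = expand("http://www.adobe.com/devnet/pdf/pdf_reference.html", "Author")

# Hypothetical namespaces ending in "/" or "#" (not published by Adobe).
slash = expand("http://www.adobe.com/devnet/pdf/pdf_reference/", "Author")
hashed = expand("http://www.adobe.com/devnet/pdf/pdf_reference#", "Author")

for uri in (glued, slash, hashed):
    print(uri)
```

Only the last two forms keep a visible boundary between namespace and term.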


In the absence of any published namespace from Adobe for these terms, I think it would have been more prudent to fall back on “data:” URIs. So

@prefix pdf: <data:,> .


data:,Author
data:,CreationDate
data:,Creator
etc.


This is correct (afaict) and merely provides a URI representation for bare strings.

Had we wanted to relate those terms to the PDF Reference we might have tried something like:

data:,PDF%20Reference:Author
data:,PDF%20Reference:CreationDate
data:,PDF%20Reference:Creator
etc.
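As a rough sketch of how such strings could be URI-ified, the following Python uses the standard library’s `urllib.parse.quote` to percent-encode the space while leaving the colon alone. The function name `data_uri` and its optional `label` parameter are my own inventions for illustration, not part of any specification.

```python
from urllib.parse import quote


def data_uri(term: str, label: str = "") -> str:
    """Wrap a bare string (optionally labelled) in a "data:," URI."""
    text = f"{label}:{term}" if label else term
    # Keep ":" readable; everything else unsafe (e.g. the space) is escaped.
    return "data:," + quote(text, safe=":")


print(data_uri("Author"))
print(data_uri("CreationDate", label="PDF Reference"))
```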


And if we had wanted to make those truly secondary RDF resources related to a primary RDF resource for the “namespace” we could have attempted something like:

data:,PDF%20Reference#Author
data:,PDF%20Reference#CreationDate
data:,PDF%20Reference#Creator
etc.


Note though that the “data:” specification is not clear about the implications of using “#”. (Is it allowed, or isn’t it?) We must suspect that it is not allowed, but see this mail from Chris Lilley (W3C), which is most insightful.
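For what it is worth, a generic URI parser will happily treat the “#” as a fragment delimiter. The small Python check below shows only what the standard library’s `urlsplit` does with such a string; it says nothing about what the “data:” specification itself permits.

```python
from urllib.parse import urlsplit

# Generic URI splitting: everything after "#" is taken as the fragment.
parts = urlsplit("data:,PDF%20Reference#Author")

print(parts.scheme)    # the scheme before the first ":"
print(parts.path)      # the "primary" part, still percent-encoded
print(parts.fragment)  # the "secondary" part after "#"
```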

2. “pdfx:”

The example was just for demo purposes, but (as per 1a above) it is incumbent on the namespace authority (here ISO) to publish a URI for the term to be used. Anyhow, the namespace URI I cited

@prefix pdfx: <urn:iso:std:iso-iec:15930:-1:2001> .


would not have been correct and would have led to these mangled URIs:

urn:iso:std:iso-iec:15930:-1:2001GTS_PDFXVersion
urn:iso:std:iso-iec:15930:-1:2001GTS_PDFXConformance


It should have been something closer to

@prefix pdfx: <urn:iso:std:iso-iec:15930:-1:2001:> .


urn:iso:std:iso-iec:15930:-1:2001:GTS_PDFXVersion
urn:iso:std:iso-iec:15930:-1:2001:GTS_PDFXConformance

3. “_usr:”

This was the one correct call in yesterday’s post.

@prefix _usr: <data:,> .


The only problem here would be to differentiate these terms from the terms listed in the PDF Reference manual, although the PDF information dictionary makes no such distinction itself.
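A quick Python sketch of that problem, assuming both “pdf:” and “_usr:” are bound to “data:,”: the prefixes disappear on expansion, so the same key under either prefix yields the same URI. The key `_usr:Author` is hypothetical and used only to force the clash, and `expand_qname` is a made-up helper name.

```python
# Assumed bindings: both prefixes point at the same "data:," namespace.
PREFIXES = {"pdf": "data:,", "_usr": "data:,"}


def expand_qname(qname: str) -> str:
    """Split "prefix:local" and concatenate the namespace with the local name."""
    prefix, local = qname.split(":", 1)
    return PREFIXES[prefix] + local


standard = expand_qname("pdf:Author")
custom = expand_qname("_usr:Author")  # hypothetical user-defined key

print(standard, custom, standard == custom)
```

Once expanded, nothing in the URI records which prefix a term was written with.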

To sum up, perhaps the best way of rendering the PDF information dictionary keys in RDF would be to use “data:” URIs for all (i.e. a methodology for URI-ifying strings) and to bear in mind that at some point ISO might publish URNs for the PDF/X mandated keys: ‘GTS_PDFXVersion’ and ‘GTS_PDFXConformance’. So,

# document infodict (object 58: 476983):
@prefix pdfx: <data:,> .
@prefix pdf:  <data:,> .
@prefix _usr: <data:,> .
<>   _usr:Apag_PDFX_Checkup "1.3";
     pdf:Author "Scott B. Tully";
     pdf:CreationDate "D:20020320135641Z";
     pdf:Creator "Unknown";
     pdfx:GTS_PDFXConformance "PDF/X-1a:2001";
     pdfx:GTS_PDFXVersion "PDF/X-1:2001";
     pdf:Keywords "PDF/X-1";
     pdf:ModDate "D:20041014121049+10'00'";
     pdf:Producer "Acrobat Distiller 4.05 for Macintosh";
     pdf:Subject "A document from our PDF archive. ";
     pdf:Title "Tully Talk November 2001";
     pdf:Trapped "False" .