Friday, 16 April 2010

Permanent URLs, Addresses and Names

I found a link to an article by Taylor Cowan about persistent URLs on the web. It was mostly about what happens to metadata assertions (such as RDF statements) when links break, but there was a little something on persistent links and URNs, too. The article compared these with Amazon.com and how books are referenced these days. It also described a way to express an ISBN as a URN (URN:ISBN:0-395-36341-1 was mapped to a location by the PURL service, in this case at http://purl.org/urn/isbn/0-395-36341-1), which is quite cool and, in my opinion, both manageable and practical.
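To make the mapping concrete, here is a minimal sketch in Python of how an ISBN URN could be turned into a PURL-style address. The base address is taken from the example above; the function name and the validation rules are my own assumptions for illustration, not how the PURL service itself is implemented.

```python
# Base address copied from the example in the post; everything else is hypothetical.
PURL_ISBN_BASE = "http://purl.org/urn/isbn/"


def isbn_urn_to_purl(urn: str) -> str:
    """Map a name such as 'URN:ISBN:0-395-36341-1' to a PURL-style address."""
    # Split into at most three parts: scheme, namespace, and the ISBN itself.
    parts = urn.strip().split(":", 2)
    if len(parts) != 3:
        raise ValueError(f"not an ISBN URN: {urn!r}")
    scheme, namespace, isbn = parts
    # The "urn" scheme and the namespace identifier are case-insensitive.
    if scheme.lower() != "urn" or namespace.lower() != "isbn" or not isbn:
        raise ValueError(f"not an ISBN URN: {urn!r}")
    # The name stays the same; only the address in front of it is added.
    return PURL_ISBN_BASE + isbn


print(isbn_urn_to_purl("URN:ISBN:0-395-36341-1"))
```

The point of the sketch is that the name (the ISBN) never changes. Only the base address does, and it lives in exactly one place.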


The author thought otherwise, however: "But on the practical web, we don’t use PURLs or URNs for books, we use the Amazon.com url. I think in practical terms things are going to be represented on the web by the domain that has the best collection with the best open content."

Now, what's wrong with this? At first, it may seem reasonable that Amazon.com, indeed the domain with the (probably) largest collection of book titles, authors, and so on, should be used. Books are their business and they depend on offering as many titles as possible. In the everyday world, if you want to find a book, you look it up at Amazon.com. I do it and you do it, and the author does it. So what's wrong with it?

Well, Amazon.com does not provide persistent content per se; they provide a commercial service funded by whatever books they sell. At any time, they may decide to change the availability of a title, relocate its page, offer a later version of the same title, or even some other title altogether. The latter is unlikely, of course, but we are talking about URLs (addresses) rather than URNs (names). Talking about the URL when discussing what essentially is a name is about as relevant as talking about the worn bookshelf in my study when discussing the Chicago Manual of Style.

Yes, I realise that my example is a bit extreme, and I realise that it's easy enough to make the necessary assertions in RDF to properly reference something described by the address rather than the address itself, but to me, this highlights several key issues:
  • An address, by its very nature, is not persistent. Therefore, a "permanent URL" is to me a bit of an oxymoron. It's a contradiction in terms.
  • Even if we accept a "permanent URL approach", should we accept that the addresses are provided and controlled by a commercial entity? One of the reasons why some of us advocate XML so vigorously is that it is open and owned by no-one. Yes, I know perfectly well that we always rely on commercial vendors for everything from editors to databases, but my point here is that we still own our data; the commercial vendors don't. I can take my data elsewhere.
  • Now, of course, in the world of metadata it's sensible to give a "see-also" link (indeed that is what Mr Cowan suggests), but the problem is that the "see-also" link is another URL with the same implicit problems as the primary URL.
  • URLs have a hard time addressing (yes, the pun is mostly intentional) the problem with versioning a document. How many times have you looked up a book at Amazon.com and found either the wrong version or a list of several versions, some of which even list the wrong book?
Of course, I'm as guilty as anyone because I do that, too. I point to exciting new books using a link to Amazon.com (actually I order my books from The Book Depository, mostly) because it's convenient. But if we discuss the principle rather than what we all do, it's (in my opinion) wrong to suggest that the practice is the best way to solve a problem that stems from addressing rather than naming. It's not a solution, it merely highlights the problem.

Tuesday, 13 April 2010

I Want A Nokia N900

I've been waiting to get my hands on a Nokia N900 smartphone for a couple of months now. Nokia released it in November or December (depending on who you choose to believe), and here in Sweden in January, but the phones have been in very short supply. I've been asking around but so far, there's been no sign of the N900, anywhere I shop. The other week I finally placed an order at The PhoneHouse. I was told that there are currently six (6) phones available for 114 stores, but that I could expect it in a week and a half or so. And if I didn't want it, the guy said he could sell it anyway...

The phone itself is a nerd's wet dream. It runs on Maemo, a Debian GNU/Linux-based distro (yes, it can run Debian apps even though the screen might be ill-suited for some of them), and is actually more of a computer with a built-in mobile than the other way around. People have successfully managed to get OpenOffice to run on it and so I'm thinking that I can probably make some kind of XML editor work on it.

A fellow XML'er in the UK has had the phone for months, now, and doesn't miss a chance to tell the world about it on Twitter. I'm jealous and I want one. Now.

Wednesday, 7 April 2010

Footnotes

Those familiar with my old schemas and DTDs will now probably raise an eyebrow, but I have finally succumbed to the lure of footnotes in the inline content model of my all-purpose personal DTD.

What finally convinced me was my need to create multiple references to a single note that, while interrupting the text flow and thus unwelcome in the text itself, was too short to place in a section of its own. There was no logical way to semantically identify that note in a form or in a place that would allow me to reference it from several different points in my text. Footnotes (and footnote references) solve that problem very neatly, and they allow me to present my footnotes as end notes using a different stylesheet.

Tuesday, 6 April 2010

Coffee

Coffee, as you all know, is the lifeblood of any office. Well, our coffee machine is dead and while I would have liked to say that it didn't suffer, the trail of dried-up coffee along the floor suggests otherwise.

Expect a slow day after the Easter holidays, here.