I gave a presentation on addresses and naming of resources in document management at the Teknisk Dokumentation 2010 conference (Swedish-language link; sorry) last week. The conference stands out among the ones I've attended lately because there was a power outage during the afternoon of the first conference day (leading to some rather different presentations).
I also learned a lot. Olaf Drummer's presentation about the PDF/A format, especially the coming PDF/A-3 standard, gave me a few ideas that I intend to implement.
Monday, 29 November 2010
Friday, 1 October 2010
XML Prague 2011
XML Prague 2011 will take place on March 26th & 27th. I'm so going to be there.
Wednesday, 22 September 2010
DITA Specialization in DocBook?
Eliot Kimber and Norman Walsh have apparently discussed DITA, DocBook and specialization à la DITA in DocBook. Norman Walsh wrote a blog entry on it, and Eliot Kimber commented on it.
Very interesting reading, at least if you are a markup geek (which I am). I don't think they've changed my opinions on DITA, however, even though I'm thinking about it.
Tuesday, 7 September 2010
Mobile Sync, Part Three
After (unsuccessfully) banging my head against the wall trying to sync my Ubuntu 10.04 laptop with the Nokia N900, I resorted to the only solution I knew would work.
I wiped out Ubuntu and installed Debian GNU/Linux Sid in its place. Apart from spending a night recovering from a dodgy dist-upgrade, the laptop now works, syncing perfectly with the N900.
Me, I think there is something wrong with Ubuntu 10.04.
Tuesday, 31 August 2010
AdSense and Spam
Gotta love AdSense. When checking my Gmail account's spam folder, I noticed that AdSense did its thing. Above the dozen or so Viagra and penis enlargement ads, AdSense had placed this:
Spam Skillet Casserole - Broil until golden
Wednesday, 18 August 2010
More XProc
I've been busy reading up on XProc today while walking through W3C's XProc Test Suite.
An XML pipeline language has been on my wish list ever since my friend Henrik Mårtensson wrote something called eXtensible Filter Objects (XFO), an XML pipeline language not unlike XProc, about ten years ago and then lost interest, focussing instead on lean theories, business management and such. Some time before he moved on he wrote a Perl implementation of XFO and another friend, David Rosell, wrote a Java version of that, but unfortunate circumstances killed it all after XFO had been implemented for a few of our then-clients at Information & Media.
XProc, of course, does more than XFO ever did, but the ideas are the same. XProc is scratching a persistent itch for me and might (IMO, of course) very well become one of XML's most important specs to date. For someone like me who is basically a non-programmer, being more of a markup theorist and dochead (to follow Ken Holman's labelling of the degrees of XML geekery), it's a wish come true.
Today, in spite of me going through the test suite and reading the spec, I feel that my most important action towards XProc wisdom was to check with Norman Walsh if he's working on an XProc book yet (he is).
I'm getting there, though. I hope to finish a working pipeline for Cassis TI publishing tomorrow.
Monday, 16 August 2010
XProc
I'm going to spend the next week or two doing a test implementation of XProc for our document management system, Cassis TI. XProc, as some of you will know, is a pipeline language for XML processing, in the same vein as pipes in the *nix world. It's intended to standardise and ease XML processing by treating the processing as a black box consisting of smaller black boxes; in other words, what is inside is less interesting than how the inputs and outputs are defined and used.
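The black-box idea can be sketched in a few lines of Python. This is not XProc, only an illustration of the principle under my own assumptions: the `pipeline` helper and the two steps, `add_title` and `number_sections`, are hypothetical names made up for the example.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List

# A step is a black box: one XML tree in, one XML tree out.
Step = Callable[[ET.Element], ET.Element]

def pipeline(steps: List[Step]) -> Step:
    """Chain steps so that the output of each becomes the input of the next."""
    def run(doc: ET.Element) -> ET.Element:
        for step in steps:
            doc = step(doc)
        return doc
    return run

def add_title(doc: ET.Element) -> ET.Element:
    """Hypothetical step: insert a <title> if the document lacks one."""
    if doc.find("title") is None:
        title = ET.Element("title")
        title.text = "Untitled"
        doc.insert(0, title)
    return doc

def number_sections(doc: ET.Element) -> ET.Element:
    """Hypothetical step: number each <section> in document order."""
    for n, section in enumerate(doc.iter("section"), start=1):
        section.set("number", str(n))
    return doc

# The pipeline is itself a step, so it can be nested in a larger pipeline.
publish = pipeline([add_title, number_sections])
result = publish(ET.fromstring("<doc><section/><section/></doc>"))
print(ET.tostring(result, encoding="unicode"))
```

The caller only needs to know what goes in and what comes out of `publish`; the steps inside can be swapped without touching the code that uses it.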
The test is about producing PDF output so it's nothing fancy or new, but it's important because I believe we can replace our current backend with an XProc-based processor, making things easier, faster and better for programmers and users alike.
Friday, 13 August 2010
Mobile Sync, Part Two
I have an older IBM Thinkpad (a T42p) laptop with Ubuntu Studio installed. In version 9.10, syncevolution worked like a charm. All I had to do was install it, set up the N900 and sync, no problems whatsoever. Then I got brave and upgraded the laptop to Ubuntu 10.04 and syncevolution to the latest version.
Fail to sync.
And mind you, it doesn't tell me what's wrong, it just fails. I've tried installing older syncevolution packages, resetting bluetooth stuff, sacrificing my firstborn... nothing helps!
If you know what's wrong, please let me know.
Labels: 10.04, 9.10, IBM Thinkpad, N900, Nokia, syncevolution, Ubuntu
Thursday, 15 July 2010
Roland D-50
Got my hands on this venerable synth. Seems I'll have to do repairs but oh, I'm so looking forward to this.
Tuesday, 29 June 2010
Mobile Sync
After years of not being able to sync my Nokia mobile(s) with my Debian Linux desktop, syncevolution and the Evolution "groupware suite" have finally made that possible. I've had success with both my older Symbian 60-based phone, N95-2, and my (Maemo-based) N900.
See www.syncevolution.org for details on how to do this. My Debian Sid box required the apt sources from that site (it seems that Sid is lagging behind, at least for now; they've packaged the last beta but the site includes the released 1.0 version), but otherwise the install and sync both went without a hitch.
VirtualBox
I've switched from KVM to VirtualBox for my virtualisation needs. My Debian laptop is the host and right now there is a Windows 7 guest. Apart from some slowness, especially with shared folders (on ext3), the whole thing works like a charm. I can run XMetaL in VirtualBox with no problems.
Finally it looks like I won't be needing a Windows partition at work.
Monday, 14 June 2010
DITA Lists, Part Two
Today it occurred to me to have a look at the DITA Architecture Specification source to see how the people behind the spec would tag a list; as some of you know, this was the subject of my recent blog entry. There are a number of lists in that spec, many with introductory paragraphs, so it's a pretty obvious way to find out, right? Well, after the examples in that spec, maybe.
Anyway, this is how they do it:
<p>Introductory para:
<ul>
<li>Item</li>
<li>Item</li>
</ul>
</p>
This was one of my guesses, and I have to say that it's better than any of the alternatives I could come up with. It's not good markup, though, in my opinion, as it says that semantically, a paragraph is sort of a block-level superclass, a do-it-all and one that you must use if you need that introduction.
But then, why limit yourself to lists? Why aren't notes tagged like that? Or definition lists, or images, or tables? Think about it. Doesn't this feel just a little bit like a cop-out to you? It does to me. It feels like the author realised that he needed that wrapper but there was nothing he could cling to, other than this construction.
I'm not saying that my way is the only way (obviously it's not) but this bothers me because it muddies the semantic waters.
Wednesday, 9 June 2010
List Modelling
I've been reading up on DITA. I've looked at the specs and the DTD before, obviously, but more from the perspective of an innocent bystander. The DTDs I implement in authoring systems and elsewhere are usually my own, and whenever I need to deliver content in some other format, I simply convert to it. This time things are a bit different, however, as we are considering doing a "DITA Edition" of the content management system I'm responsible for at work, and I need to know how DITA can fit into our stuff.
DITA's got lots of things that I like, such as the combining of topic IDs with target IDs in references to avoid ID collisions. The DITA way is a very elegant solution and probably a better one than what I would usually do, which is to (in various ways in the DTD and in the authoring environment) make sure that authors can never end up in situations like it to begin with. There's other stuff, too, but those are best left to another blog entry at some point.
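To show why this avoids collisions, here is a small Python sketch that splits a DITA-style reference of the form `file#topicid/elementid` into its parts. The file name and IDs are made up for illustration, and the `DitaTarget` type and `parse_dita_href` function are hypothetical helpers, not part of any DITA toolkit.

```python
from typing import NamedTuple, Optional

class DitaTarget(NamedTuple):
    file: str                   # empty for a same-file reference
    topic_id: Optional[str]
    element_id: Optional[str]

def parse_dita_href(href: str) -> DitaTarget:
    """Split 'file#topicid/elementid' into its three parts."""
    file, _, fragment = href.partition("#")
    if not fragment:
        return DitaTarget(file, None, None)
    topic_id, _, element_id = fragment.partition("/")
    return DitaTarget(file, topic_id, element_id or None)

# Hypothetical reference: the element ID "apples" only has to be unique
# within the topic "groceries", not within the whole file.
print(parse_dita_href("fruit.dita#groceries/apples"))
```

Because the element ID is always qualified by its topic ID, two topics can each contain an element called `apples` without the references becoming ambiguous.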
Here, I want to talk about list modelling and specifically something that not only DITA but so many other DTDs and schemas seem to ignore, and that, in my mind, results in bad markup. Let's start by discussing list semantics first:
A list is, well, a list of things. There are several types of lists, of which unordered and ordered are the most common, and the semantics are probably clear enough: the former lists stuff without a specific order (say, grocery lists) and the latter items whose order is significant (for example, David Letterman's top ten lists). There's also the definition list (which, in my mind, is not a list at all but a special case of a table, namely a two-column one), and probably some other types as well. In DITA, you can find something called "simple list", which claims to limit what's listed to one line per item, tops, without bullets or numbers, but to me that's less about semantics and more about presentation.
So here's a typical DITA list (HTML, DocBook and quite a few others look exactly like it, too):
<ul>
<li>Apples</li>
<li>Oranges</li>
<li>Bananas</li>
</ul>
There's more to list semantics, though, at least in my mind. If you wanted to find a complete list in a document, you'd probably want to include its qualifying introduction ("Here's the groceries you need to buy:"), and any and all information that goes between list items without being part of them but still belonging to the list as a whole. If your spouse is kind enough to subcategorise the grocery list to vegetables, fruit, dairy products and so on (I know I need the help), we'd have a multi-part list where the participating lists are part of a larger whole.
The introductory paragraph is where it gets tricky in DITA and similar structures. There are a LOT of block-level elements to choose from, but you cannot easily do a list that meets these requirements. This one, the preferred DITA way (at least if we choose to believe the examples in the spec), lacks a wrapper that identifies the list as one unit instead of a loose paragraph that happens to be followed by a list:
<p>The fruit we need for tonight:</p>
<ul>
<li>Apples</li>
<li>Oranges</li>
<li>Bananas</li>
</ul>
<p>And the vegetables for tomorrow:</p>
<ul>
<li>Cucumbers</li>
<li>Tomatoes</li>
</ul>
Of course, one could argue that our grocery list is really a section, but I would argue that the introductory paragraph is actually part of the list, but not necessarily a part of the whole section. What if I wanted to include images or perhaps a note to that section? Semantically, I can think of dozens of ways to reasonably expand the structure of such a surrounding section and still keep it on topic (that is, limiting it to subject matters concerning that central grocery list).
Keeping with DITA's topic-based approach, we could certainly use a number of such sections and wrap the whole thing in a topic, but me, I think that's overkill. All I want to do is include an introductory paragraph.
This, of course, is where some will argue that the introductory paragraph is really a heading. Definition lists in DITA and some other DTDs actually do have a heading for this very purpose, which to me hints that somebody did touch the subject at hand at some point, but then why are the "ordinary" lists left without that heading? And of course, me, I think that introduction is not a heading at all, only a qualifier for the list.
Another option in DITA and others is to use the <p> element as a wrapper:
<p>The fruit we need for tonight:
<ul>
<li>Apples</li>
<li>Oranges</li>
<li>Bananas</li>
</ul>
And the vegetables for tomorrow:
<ul>
<li>Cucumbers</li>
<li>Tomatoes</li>
</ul>
</p>
This is perfectly valid, of course, but it ruins the intent of the <p> element and creates a very odd (and ugly) mixed content that would be difficult to process properly.
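A quick way to see the processing problem is to parse the wrapper example with Python's standard `xml.etree.ElementTree`. This is only a sketch of one parser's view of the markup, with whitespace removed to keep the output readable.

```python
import xml.etree.ElementTree as ET

wrapped = ET.fromstring(
    "<p>The fruit we need for tonight:"
    "<ul><li>Apples</li><li>Oranges</li><li>Bananas</li></ul>"
    "And the vegetables for tomorrow:"
    "<ul><li>Cucumbers</li><li>Tomatoes</li></ul>"
    "</p>"
)

first_list, second_list = wrapped.findall("ul")

# The first introduction is the paragraph's own leading text ...
print(repr(wrapped.text))
# ... but the second one is stored as the "tail" of the FIRST list, so a
# stylesheet or script has to look at the preceding sibling to find the
# text that introduces the second list.
print(repr(first_list.tail))
print(repr(second_list.tail))
```

Nothing in the markup ties an introduction to the list it introduces; the relationship has to be inferred from the order of text nodes and elements.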
What I would like to see is more along the lines of this:
<ul>
<p>The fruit we need for tonight:</p>
<li>Apples</li>
<li>Oranges</li>
<li>Bananas</li>
<p>And the vegetables for tomorrow:</p>
<li>Cucumbers</li>
<li>Tomatoes</li>
</ul>
Now we have a single list (our grocery list) that includes the necessary introduction(s). Of course, it's still somewhat ugly; I, for one, dislike the relative lack of list item structure--I'd much rather see an item modelled more properly, perhaps divided into paragraphs and other block-level content, where the concepts block level and inline remain properly separated.
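To illustrate what the wrapper buys you, here is a Python sketch that reads the list modelled the way I would like it. Keep in mind that this content model is my wish, not valid DITA or HTML, and that `group_items` is a hypothetical helper written for this example.

```python
import xml.etree.ElementTree as ET

grocery_list = ET.fromstring(
    "<ul>"
    "<p>The fruit we need for tonight:</p>"
    "<li>Apples</li><li>Oranges</li><li>Bananas</li>"
    "<p>And the vegetables for tomorrow:</p>"
    "<li>Cucumbers</li><li>Tomatoes</li>"
    "</ul>"
)

def group_items(ul):
    """Return (introduction, items) pairs in document order."""
    groups = []
    for child in ul:
        if child.tag == "p":
            # Every introduction starts a new group of items.
            groups.append((child.text, []))
        elif child.tag == "li":
            if not groups:
                # Items that appear before any introduction.
                groups.append((None, []))
            groups[-1][1].append(child.text)
    return groups

groups = group_items(grocery_list)
for intro, items in groups:
    print(intro, items)
```

Everything that belongs to the grocery list sits inside one element, so finding the complete list, introductions included, is a matter of selecting a single `ul`.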
Monday, 7 June 2010
4G?
Apple unveiled their newest iPhone, the 4G, earlier today. Looks like they are getting a little closer to what my Nokia N900 can do. I have to admit that they know how to market their stuff. If the functionality matched the hype, I might even be interested.
Tuesday, 25 May 2010
N900 Gets New Firmware
Those of us owning a Nokia N900 have been patiently (well, some of us, in any case) waiting for the new firmware, PR 1.2. It's been delayed a couple of times but now maemo.org informs us that tomorrow the wait is finally over.
Among the goodies are a new Qt library, more apps, a revamped UI, and quite a list of bug fixes.
Monday, 24 May 2010
Bye Bye Andrew Wakefield
Finally.
http://news.yahoo.com/s/ap/20100524/ap_on_sc/eu_britain_autism_doctor
Wednesday, 12 May 2010
Blogging from My N900
This is just to test blogging using my N900 and MaStory. It seems to work.
Friday, 16 April 2010
Permanent URLs, Addresses and Names
I found a link to an article by Taylor Cowan about persistent URLs on the web. It was mostly about what happens to metadata assertions (such as RDF statements) when links break, but there was a little something on persistent links and URNs, too. A comparison with Amazon.com and how books are referenced these days was made. A way to map the ISBN number as a URN was described (URN:ISBN:0-395-36341-1 was mapped to a location by the PURL service, in this case at http://purl.org/urn/isbn/0-395-36341-1), which is quite cool and, in my opinion, both manageable and practical.
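The mapping itself is simple enough to sketch in Python. The function below only rewrites the string the way the article describes; it does not contact the PURL service or check that the address still resolves, and the name `isbn_urn_to_purl` is my own.

```python
PURL_ISBN_BASE = "http://purl.org/urn/isbn/"

def isbn_urn_to_purl(urn: str) -> str:
    """Map a name such as 'URN:ISBN:0-395-36341-1' to its PURL address."""
    scheme, namespace, isbn = urn.split(":", 2)
    if scheme.lower() != "urn" or namespace.lower() != "isbn":
        raise ValueError(f"not an ISBN URN: {urn!r}")
    return PURL_ISBN_BASE + isbn

print(isbn_urn_to_purl("URN:ISBN:0-395-36341-1"))
# prints http://purl.org/urn/isbn/0-395-36341-1
```

The name stays the same no matter where the book's page lives; only the resolver needs to know the current location.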
The author thought otherwise, however:
But on the practical web, we don’t use PURLs or URNs for books, we use the Amazon.com url. I think in practical terms things are going to be represented on the web by the domain that has the best collection with the best open content.
Now, what's wrong with this? At first, it may seem reasonable that Amazon.com, indeed the domain with (probably) the largest collection of book titles, authors, and so on, should be used. Books are their business and they depend on offering as many titles as possible. In the everyday world, if you want to find a book, you look it up at Amazon.com. I do it and you do it, and the author does it. So what's wrong with it?
Well, Amazon.com does not provide persistent content per se; they provide a commercial service funded by whatever books they sell. At any time, they may decide to change the availability of a title, relocate its page, offer a later version of the same title, or even some other title altogether. The latter is unlikely, of course, but we are talking about URLs (addresses) rather than URNs (names), and talking about the URL when discussing what is essentially a name is about as relevant as talking about the worn bookshelf in my study when discussing the Chicago Manual of Style.
Yes, I realise that my example is a bit extreme, and I realise that it's easy enough to make the necessary assertions in RDF to properly reference something described by the address rather than the address itself, but to me, this highlights several key issues:
- An address, by its very nature, is not persistent. Therefore, a "permanent URL" is to me a bit of an oxymoron. It's a contradiction in terms.
- Even if we accept a "permanent URL approach", should we accept that the addresses are provided and controlled by a commercial entity? One of the reasons why some of us advocate XML so vigorously is that it is open and owned by no-one. Yes, I know perfectly well that we always rely on commercial vendors for everything from editors to databases, but my point here is that we still own our data; the commercial vendors don't. I can take my data elsewhere.
- Now, of course, in the world of metadata it's sensible to give a "see-also" link (indeed that is what Mr Cowan suggests), but the problem is that the "see-also" link is another URL with the same implicit problems as the primary URL.
- URLs have a hard time addressing (yes, the pun is mostly intentional) the problem with versioning a document. How many times have you looked up a book at Amazon.com and found either the wrong version or a list of several versions, some of which even list the wrong book?
Tuesday, 13 April 2010
I Want A Nokia N900
I've been waiting to get my hands on a Nokia N900 smartphone for a couple of months now. Nokia released it in November or December (depending on who you choose to believe), and here in Sweden in January, but the phones have been in very short supply. I've been asking around but so far, there's been no sign of the N900, anywhere I shop. The other week I finally placed an order at The PhoneHouse. I was told that there are currently six (6) phones available for 114 stores, but that I could expect it in a week and a half or so. And if I didn't want it, the guy said he could sell it anyway...
The phone itself is a nerd's wet dream. It runs on Maemo, a Debian GNU/Linux-based distro (yes, it can run Debian apps even though the screen might be ill-suited for some of them), and is actually more of a computer with a built-in mobile than the other way around. People have successfully managed to get OpenOffice to run on it and so I'm thinking that I can probably make some kind of XML editor work on it.
A fellow XML'er in the UK has had the phone for months, now, and doesn't miss a chance to tell the world about it on Twitter. I'm jealous and I want one. Now.
Wednesday, 7 April 2010
Footnotes
Those familiar with my old schemas and DTDs will now probably raise an eyebrow, but I have finally succumbed to the lure of footnotes in the inline content model of my all-purpose personal DTD.
What finally convinced me was my need to create multiple references to a single note that, while interrupting the text flow and thus unwelcome in the text itself, was too short to place in a section of its own. There was no logical way to semantically identify that note in a form or in a place that would allow me to reference it from several different points in my text. Footnotes (and footnote references) solve that problem very neatly, and they allow me to present my footnotes as end notes using a different stylesheet.
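As a rough illustration, here is a Python sketch of how one note can serve several reference points and still come out as an end note. The element and attribute names (`footnote`, `footnoteref`, `id`, `ref`) are assumptions made for the example, not necessarily what my DTD uses, and `collect_end_notes` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<section>'
    '<p>First claim<footnote id="fn-src">See the spec.</footnote>.</p>'
    '<p>Second claim<footnoteref ref="fn-src"/>.</p>'
    '</section>'
)

def collect_end_notes(root):
    """Number each footnote once and map every reference to that number."""
    numbers = {}   # footnote id -> note number
    notes = []     # note texts, in order of first appearance
    for note in root.iter("footnote"):
        numbers[note.get("id")] = len(notes) + 1
        notes.append(note.text)
    # Each reference reuses the number of the note it points to.
    markers = [numbers[ref.get("ref")] for ref in root.iter("footnoteref")]
    return notes, numbers, markers

notes, numbers, markers = collect_end_notes(doc)
print(notes, numbers, markers)
```

Whether the collected notes are printed at the foot of the page or at the end of the document is then purely a stylesheet decision.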
Tuesday, 6 April 2010
Coffee
Coffee, as you all know, is the lifeblood of any office. Well, our coffee machine is dead and while I would have liked to say that it didn't suffer, the trail of dried-up coffee along the floor suggests otherwise.
Expect a slow day here after the Easter holidays.
Tuesday, 23 March 2010
Friday, 19 March 2010
XML for the Long Haul
There will be a one-day symposium on the theme XML for the Long Haul, right before the Balisage conference in Montréal this year. I've thought about this, lately.
First of all, isn't this what XML is about? The ability for information to survive a proprietary method of conserving it? The means to make it happen, regardless of what happens to your software? I've preached about this for a long time to my customers, listeners, and those who just couldn't get away. If a disaster happened to your software, if it was somehow wiped out in spite of your best efforts, my point was that it would only take a few days to build something that would parse most of the information in an XML file. Maybe another few days to produce output from it, but provided that you spoke the written language and the structure was done by someone who had at least a basic idea of what XML (and SGML; this isn't new) was about, it wouldn't take more than a few days at most to see what that lost information was about.
Second, my points re the first, above, pretty much summarise my views here, but I really mean it: This is what XML is about.
But is it really that simple? Is markup really that descriptive? Well, not always. There's plenty of markup out there that is obscure and hard to read. For example, is a namespace going to make your leftover instances easier to read? Are your element type names descriptive? What about your attributes? Do you include comments or annotations with your schema? Do you include wrappers that contain groups of element types in a semantically meaningful way? Does your group include everything required for that group to be complete? Have a look at one of your instances with fresh eyes, see if it makes sense. Does one type of information relate to another? How would you format this lost instance, if you had just come across it? If it had been a thousand years and you could understand the language but not the culture, would you understand the meaning of the information? Could you print it and explain what went on then?
Don't laugh. Pretend that you really are viewing your structures from the outside. Pretend that you don't have the schema at hand. Pretend that you don't know the semantics, even though you can understand the contents. Pretend that you really are studying the information as an outsider. Does it all make sense?
I think this is a worthwhile reality check. I think that we all should ask this of the schemas we create, every time we do an information analysis. Are our schemas understandable? Are they legible?
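A minimal sketch of that outsider's first pass, assuming Python and its standard xml.etree.ElementTree module. The sample instance and the inventory function are made up for illustration; the point is only that a schema-less survey of element types and attributes takes a handful of lines, after which the names themselves have to do the explaining.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical leftover instance; the element and attribute names are made up.
SAMPLE = """<manual lang="en">
  <chapter id="c1">
    <title>Projector Maintenance</title>
    <p>Clean the gate after every show.</p>
    <p>Check the lamp hours weekly.</p>
  </chapter>
</manual>"""


def inventory(xml_text):
    """Count element types and collect the attribute names seen on each, without a schema."""
    root = ET.fromstring(xml_text)
    elements = Counter()
    attributes = {}
    for node in root.iter():  # iter() walks the root and all its descendants
        elements[node.tag] += 1
        # Updating a set with a dict adds the dict's keys, i.e. the attribute names.
        attributes.setdefault(node.tag, set()).update(node.attrib)
    return elements, attributes


elements, attributes = inventory(SAMPLE)
for name, count in elements.items():
    print(name, count, sorted(attributes[name]))
```

If the list that comes out reads like a table of contents for the information, the markup is probably legible. If it reads like a list of abbreviations, it probably isn't.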
I would really like to be in Montréal in August this year. I think it's important.
Tuesday, 16 March 2010
Back from XML Prague
I'm back home from XML Prague. It's been a fabulous weekend with many interesting talks and lots of good ideas, and I'm still trying to sort my impressions. So many things I want to try, so many technologies I want to learn. The feedback from my talk on Film Markup Language alone is enough to keep me busy for a few weeks.
More later, but for now, suffice it to say that I'm already thinking of a subject for a presentation next year.
Friday, 12 March 2010
It's Quite Possible to Lose Your Way in Prague
I drove to Prague for XML Prague, yesterday. I left Göteborg on Wednesday evening, taking the ferry to Kiel, and then spent most of Thursday on the Autobahn. It all went without a hitch; not that I'm that good but my GPS is. I would probably have ended up in Poland without it because I often miss the road signs when on my own. Some of my business trips before the GPS era were truly memorable.
So today I took a walk around central Prague, shopping for gifts and seeing the sights. And a wonderful city it is, one of my favourite cities in Europe. All that history, all that architecture, the bridges... and small, narrow streets that are never straight. They are practically organic (and probably feed on the gift shops since they are everywhere), and it's very difficult to find your way. It's a labyrinth we are talking about.
Yes, I lost my way. The third time I came back to that innocent-looking Kodak shop (and there are a lot of shops with Kodak signs in central Prague, I might add), I knew I was in trouble. I was walking in circles, my feet aching while a particularly wet mixture of snow and rain poured down, and had no idea where I was. And I kept thinking about my GPS, safely tucked away back in my hotel room, remembering that I actually considered bringing it along for the walk but then shrugging, thinking "how hard can it be?"
I found a shelter in a mall I hadn't seen before (well, I think I hadn't seen it before) and considered my next move while high-heeled ladies tried lipsticks and wondered what the out-of-place stranger was doing in the cosmetics department. I could ask someone, I suppose, some friendly local...
Then I remembered: I have a GPS in my mobile. It took a few minutes for it to find the satellites it required but after that, I only had to walk for a few more minutes to find a familiar landmark. In a counter-intuitive direction, I might add.
The wisdom in this story? Thank goodness for GPS devices. Oh, and XML Prague starts tomorrow morning.
Tuesday, 2 March 2010
Automating Cinemas at XML Prague
I've been busy writing my presentation and some example XML documents for my presentation on Automating Cinemas Using XML at XML Prague in about a week and a half. I'm slightly biased, I know, but I think the presentation actually does make a good case for XML-based automation of cinemas. I know how primitive today's automation is, in spite of the many technological advances, and I know where to improve it. The question I'm pondering right now is how to explain the key points to a bunch of XML people who've probably never seen a projection booth, and do it in twenty minutes.
The opposite holds true, of course, if I ever want to sell my ideas to theatre owners. They know enough about the technology (I hope) but how on earth will I be able to explain what XML is?
I still have stuff to do (for one, it would be nice to finish the XSLT conversions required and be able to demonstrate those, live, at the conference) but the presentation itself is practically finished and the DTD and example documents are coming along nicely. I suppose I need to update the whitepaper accordingly and publish it here, when I'm done.
See you at XML Prague!
Wednesday, 24 February 2010
Developing SGML DTDs: From Text To Model To Markup
Quite by accident, I discovered that Eve Maler and Jeanne El Andaloussi's Developing SGML DTDs: From Text To Model To Markup is available online. I'm one of the people lucky enough to own a hard copy, but if you aren't as fortunate, read it at http://www.xmlgrrl.com/publications/DSDTD/. It's one of the best books ever written about information analysis, that (far too) little used skill required to write a good DTD. In my ever-so humble opinion, the book should be mandatory for anyone involved in a markup-related project of any kind, that's how good it is.
(Yes, I know it was written before XML came out, 12 years ago, but XML is SGML, really, and the book remains as useful today as it was when it came out in 1995.)
Friday, 19 February 2010
Tiny URLs
I don't like them. Tiny URLs, that is. Those short things that look like web addresses (they are!) but give no clue to what their targets are. They have become commonplace enough now, though, and it's time to react.
I really don't like them.
Here's why: They look just like those little URLs that used to be well hidden in seemingly legitimate spam emails. Every time I see them, my first thought is spam. If I follow that link, someone will exploit a weakness in my browser to gain control over my machine or empty my credit card, somehow. And yes, I know, it won't really happen but I've lived with spam for a long time and I don't trust anything that cannot be deciphered simply by looking at it. I'm a bit silly in that respect. Yes, I realise there are benefits to using short URLs when tweeting, when your available space is counted in characters, but that's another instinctive dislike of mine: What's the point of messages forced to be short in such an arbitrary manner?
Yes, I use Twitter myself (mostly to keep track of stuff such as my favourite XML conference, XML Prague) and I fully understand the need for short URLs in tweets. You don't really want to waste the available space on a URL, if at all possible. It's a neat way of solving a problem, but a problem that is extremely artificial to begin with, to make room for other characters in an arbitrarily limited message.
But above all, I don't trust tiny URLs because I can't see what they are about. They are just characters preceded by "http://" and they look every bit as sneaky as that link you just know will break your machine.
Tuesday, 9 February 2010
Festival Impressions
The Göteborg Film Festival is over and life is slowly returning to normal. As usual I've worked at the festival as a projectionist (my 21st consecutive year at the Draken Cinema), screening films day and night, and the first few days after each festival are always a blur. First of all, I've had way too little sleep so my brain is not working at full speed. Second, the festival itself imposes a mental and physical routine that takes a few days to break. A day at the festival is divided into shows starting at certain times so everything I do is based on these fixed points in time; when I eat, when I have coffee, when I do anything but the screening itself.
And I'm not there yet. The last show was at 9 p.m. last night and mentally I'm still in the projection booth. I have still to say more than "hi" to my family, and I have no idea of what's been going on in the outside world, other than what I've learned through the Internet.
I expect the same to be true for many of my colleagues and probably quite a few festival visitors. The difference between me and most of them is that I don't watch films, I just screen them. The vast majority of the others visit and work at the festival because they love watching films. They see several of those every day, for 11 straight days, and then discuss them between themselves, finding new angles, new interpretations.
And sometimes they ask me about the films. Did I see anything good? Was the festival a success? Was this or that actor in film xyz? Etc. And I always tell them that I have no idea, that I didn't see a single film, that I don't care about what I show, just that it's shown as well as possible. I'm not there for the films, I'm there for the projection. It's a film projection marathon and I like the challenge. And every time, they are mystified. They look at me in disbelief, wondering why, wondering how I can spend 11 days in a dark projection booth, screening 60 shows without being interested in what I show.
It's the work itself, people. It's the technology, the projectors and the sound systems, but it's also the art, the show itself, with curtains and lights and magic; and it's the craftsmanship, inspecting film prints and handling the various requirements that together result in a successful show.
I explain this to people and they nod as if finally understanding... until the next time around, the next year and the next festival.
So yes, there might have been a few good films this year but I don't know that, and I really don't care. Was the festival successful? Yes, my screenings went well, all of them.
See you next year.
Thursday, 4 February 2010
Five Days and Counting
The film festival is past the halfway mark now. And yes, I'm counting the hours.
Wednesday, 27 January 2010
Two Days Left To...
...Göteborg Film Festival and I'm already wishing for it to be over and done with.
Tuesday, 26 January 2010
Indexing Functionality in FOP
Anyone reading this who happens to be involved in the development of FOP, Apache's open source XSL-FO engine? If I ask you really nicely and politely, would you please consider implementing XSL-FO 1.1 index handling?
Alternatively, can you recommend an FO engine that is capable of index handling but costs less than RenderX's XEP or Antenna House's XSL Formatter?
Visual Studio and XMetaL
I'm doing an XMetaL-based authoring environment based on scripts and stuff from earlier projects. I already have the CSS and I have most of the macros. All I need is a rules file, that is, XMetaL's compiled DTD file for the documents I need to write using this new environment, a few customisations, and a toolbar. For this I need to install 3.6 Gigabytes of Visual Studio .Net and XMetaL Developer. Is it just me or does any of you reading this agree with me that this is like taking an eighteen-wheeler to buy groceries? I know, I've ranted about this before, but it still amazes me that the XMetaL developers can allow this madness to continue.
C'mon, JustSystems, give us a way to customize XMetaL without having to buy Visual Studio. Give us what we had before XMetaL 4 and the misguided Corel deal to shut out other platforms. It doesn't have to be like this.
Saturday, 23 January 2010
The Göteborg International Film Festival...
...is now less than a week away. I don't care about what they say about the Stockholm equivalent; ours is still Scandinavia's largest and if you care about film, you should attend.
Thursday, 21 January 2010
Friday, 15 January 2010
elementNames and attributeNames
I keep getting annoyed by the (Java-inspired) naming of elements and attributes in some people's XML, where the names contain capital letters to help keep the names clear. I'm sure you've seen how it works: elementName, attributeName, myNewAndExcitingElement, ohLookICanCreateReallyLongQNamesForNoApparentReason, ad nauseam.
Why do they do this? I know there is some kind of rationalisation for it in the world of programming languages, but in XML? XML is not a programming language and I still think it should be understandable and usable by humans (I know; SGML was supposed to be human-readable but XML doesn't have that requirement). If you find yourself writing XML in a text editor (still happens to me), not only are these names enough to drive anyone nuts but they also make the XML more error-prone because you're bound to spl something wrong. And if you write your XML in an XML editor, the element names filling the start and end tag symbols take up a lot of space that should be left to the actual content. (And no, I don't believe in the minimal tag symbols that some editors provide; I want to actually see the tag names and I want to see the attribute names. They help me structure my document; in fact, they are there for that purpose!)
I ask again: why? If you are writing a schema and need to name an ordinary paragraph element, surely you don't need to name it ordinaryParagraph or even paragraph? In my schemas, p is more than enough.
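For anyone who wants to check their own instances, here is a rough sketch in Python, assuming the standard xml.etree.ElementTree module. The sample document, the suspicious_names function and the length threshold are all made up for illustration, and namespaces are ignored entirely.

```python
import re
import xml.etree.ElementTree as ET

# Made-up sample mixing both naming styles.
SAMPLE = """<doc>
  <ordinaryParagraph importanceLevel="high">Some text.</ordinaryParagraph>
  <p>Some more text.</p>
</doc>"""

MAX_LENGTH = 12  # arbitrary threshold, chosen for illustration only


def suspicious_names(xml_text, max_length=MAX_LENGTH):
    """Return, sorted, the element and attribute names that contain capitals or are too long."""
    root = ET.fromstring(xml_text)
    names = set()
    for node in root.iter():
        names.add(node.tag)
        names.update(node.attrib)  # adds the attribute names (the dict's keys)
    return sorted(n for n in names if re.search(r"[A-Z]", n) or len(n) > max_length)


print(suspicious_names(SAMPLE))
```

For the sample above, this flags ordinaryParagraph and importanceLevel and leaves doc and p alone.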
SoPleaseUseShorterNamesWithoutResortingToSillyConventionsBorrowedFromElsewhere.
Wednesday, 13 January 2010
Göteborg Film Festival
For 11 days every year, I take time off XML and the IT business to show films at the Göteborg Film Festival. I've been involved in the festival since 1987 and showing films at the Draken Cinema (for the festival; I've worked at the place for longer than that in other contexts) since 1990.
In just over two weeks, it's time for my 21st consecutive festival at the Draken.
Monday, 11 January 2010
XML Prague 2010
I'm proud to inform you that my little something on Film Markup Language has been accepted at XML Prague. The conference will take place on March 13-14.
Friday, 8 January 2010
Finally, a new Intel Xorg driver in Debian Sid!
As most Intel video card users on Linux will know, the Xorg drivers have regressed significantly during the last year or so. From a reasonably stable driver with (mostly) expected performance and functionality, we've become accustomed to, well, a mess. For every bug fix, something new seems to break and I for one have become increasingly reluctant to upgrade unless I have to.
This time I really had to.
The new driver does seem to take care of the disappearing mouse pointer bug where any resolution higher than 1024x768 would make the pointer vanish. I had hopes it would also be able to recognise the correct resolution for my laptop when it is docked to an external screen (which the stable driver does without a problem) but no such luck.
Performance is still slow, too. The extra bells and whistles on KDE 4.3 just aren't possible if you want a desktop you can work with. I don't think they are that heavy on the system, it's just that the Intel driver sucks.
Still, for the first time in months, the new driver means an actual improvement.
Tuesday, 5 January 2010
Words in Boxes
This is the day for reading other people's blogs. Dave Pawson's XProc tutorial indirectly pointed me to James Sulak's blog, Words in Boxes. A lot of it is about XML-related stuff but I also found gems such as his rant on grammar.
Recommended reading.
An XProc Tutorial
Dave Pawson has written an XProc Tutorial, with contributions from James Fuller, James Sulak and Norman Walsh. If you need to do step-by-step XML processing in your application and haven't yet heard of XProc, follow that link, now.
Friday, 1 January 2010