Friday, 28 September 2012

Topic-based

I recently held a two-day workshop on topic-based information for a client faced with moving from paper-based documentation to multiple outputs in multiple media, especially on smartphones. Now, before drawing conclusions, you should know that this particular client does have a reasonably mature process supported by a reasonably mature system. They already produce in XML, they translate their content into multiple languages, and they already publish automatically.

Their information is very much "book-oriented", however. It's sequential and it has interdependencies all over the documentation.

Topic-based information had been suggested to them as a means to an end, and my task, therefore, became to educate them about what is meant by topic-based information, what the intended advantages and frequent challenges are, what standards there are out there to support the concepts, and how it all relates to their situation today. And of course, I needed to tell them about DITA because while DITA equals neither topic-based nor multi-channel publishing per se, it has become something of a de facto standard for topic-based information and there is a lot to be learned from it.

I remained largely neutral concerning DITA throughout the workshop, but nevertheless, I was forced to reconsider and, in some cases, re-evaluate some of my opinions. DITA is what it is: it is widespread, it is constantly being developed, and it cannot be ignored when discussing topic-based information solutions.

Take the strict topic orientation as a primitive example. One task, one topic. No dependencies, no context or hierarchy linking the topic to others from within the topic itself, no broken cross-references, et cetera. I have frequently dismissed parts of this as the inevitable consequences of ill-designed systems, but as I was highlighting practical examples from my client's current information, I did see the value of the concept of a single, isolated task beyond mere system limitations. See, while a system does help if implemented properly, any dependencies in the information will nevertheless make it more difficult to maintain and update if used in several different contexts. I could clearly see this happen with my client's documentation, and while I'm not at liberty to discuss any specifics, theirs was a very good case for minimalism.

More obvious, perhaps, were the strategies implied by DITA concerning online documentation. If publishing for a smartphone, for example, it is obvious that size does matter. There is no room for large overviews or tables, nor is there a place for long narratives. There is no way to know how the reader arrived at the current topic, so there is no way to supply that narrative, nor is there room for a longer list of contents or a list of related topics that are nice to have but not essential. There are obvious implications for large content, including eliminating those pesky overviews, but also for how to present single, self-sufficient topics.

You have to make every such topic completely independent from the next or the previous ones, because there is no way to know what the next or previous ones were about. The limited space needs to focus on solving the task at hand, so giving references and links is tricky at best.

A topic is only placed in a publication later on, through a DITA map, and always in a specific context. Both that context and the target format are known only when the publication is created, so the map is the logical place for any such references. Maps are where anything context-related belongs, including hierarchies, cross-references, etc.
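To make the division of labour concrete, here is a minimal sketch in Python rather than in DITA markup. It is only an illustration of the idea, not DITA's actual element set or any real toolkit's API: the names Topic, TopicMap and render, and the sample topics, are all made up for this example. The topics carry nothing but their own content, while each map decides order, nesting and related links for one particular publication.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Topic:
    """A self-contained topic: no links, no knowledge of its neighbours."""
    topic_id: str
    title: str
    body: str


@dataclass
class TopicMap:
    """One publication: owns ordering, hierarchy and related links."""
    title: str
    # (topic id, nesting depth) pairs in reading order
    entries: List[Tuple[str, int]] = field(default_factory=list)
    # topic id -> ids of related topics, valid for this publication only
    related: Dict[str, List[str]] = field(default_factory=dict)


def render(topic_map: TopicMap, topics: Dict[str, Topic]) -> str:
    """Assemble a publication; all context comes from the map."""
    lines = [topic_map.title]
    for topic_id, depth in topic_map.entries:
        topic = topics[topic_id]
        indent = "  " * depth
        lines.append(f"{indent}{topic.title}")
        lines.append(f"{indent}  {topic.body}")
        for related_id in topic_map.related.get(topic_id, []):
            lines.append(f"{indent}  See also: {topics[related_id].title}")
    return "\n".join(lines)


# Hypothetical content: two independent task topics.
topics = {
    "pair": Topic("pair", "Pair the headset", "Hold the button for 3 seconds."),
    "reset": Topic("reset", "Reset the headset", "Hold both buttons for 10 seconds."),
}

# Smartphone help: a single task, no room for related links.
phone_help = TopicMap("Quick help", entries=[("pair", 0)])

# Printed manual: same topics, but with hierarchy and cross-references.
manual = TopicMap(
    "User manual",
    entries=[("pair", 0), ("reset", 1)],
    related={"pair": ["reset"]},
)

phone_output = render(phone_help, topics)
manual_output = render(manual, topics)
```

The same "pair" topic appears in both outputs unchanged. Only the manual's map adds the nested reset task and the "See also" line, so nothing in the topic itself can break when it is reused elsewhere.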

DITA is certainly not the only way to achieve strict topic orientation, but it is unusual in offering a comprehensive method for achieving it, including minimalist concepts, online documentation requirements, etc., in one place. One could argue the merits of something like S1000D for purpose-filled topical documentation, but while S1000D is many things, I doubt it will ever be accused of minimalism. And these days, DITA is expanding outside its original box of software documentation and, increasingly, solving problems in new domains.

DITA brings with it a number of challenges (that's the same as "problems" but in presales-speak), of which many have to do with how to restore some of the inherent readability of sequential content meant for paper-based books, and I remain unconvinced in this regard. Markup-wise, the DTD leaves room for improvement, and I think there are better ways to design linking mechanisms (even though DITA includes some clever ID-related tricks). I think specialisation suffers because the original DTD suffers, and I think DITA struggles when it comes to profiling information.

But just as DITA is not the only XML-related standard to offer topic orientation and reuse, it is not the only one with problems. It is perhaps too easy for a grumpy old XML guy like me to dismiss DITA because I find problems in its execution. There are a lot of good things in it, too, and this blog entry is my way of saying that I am reconsidering.

Who says you can't teach old dogs new tricks? Next I'll be embracing Java.
