Monday 25 August 2008

DITA

For the last year or three, XML editor makers have been busy coding DITA customizations into their products. The latest editor to get DITA support is Oxygen, my XML IDE of choice these days. DITA is the latest fad, see, and there's money to be made.

But I'm not convinced, and here's why:

DITA claims to make life easier for users by splitting documents into smaller, reusable pieces, hinting that this is a fresh, new approach to documentation. It's not; some of us have done this for years in our DTDs, long before XML was even thought of, simply because that's one of the main points of structured information. It's the sensible thing to do, and a good reason why structured information is useful in the first place.

Now, this is all well and good, but because DITA needs to appeal to a lot of users, it is a generic structure, and it's big. Both of these things are unfortunate: bigger means more difficult to learn, both for users and developers, and generic means that to apply the structure to your specific needs, an abstraction (customization) level is needed.

Generic also means that any markup specific to one user's needs will have to be added, which means more customization.

With DITA comes a package of stylesheets and utilities, also big and generic, hard to learn, and in need of customization, not only to add the user-specific requirements, but also to modify their look and feel. After all, you don't really want to have your documents look like the next guy's, do you?

See, what the DITA advocates are saying, basically, is that either you do want that, or you need to customize.

My view of document structures is just the opposite, really. I'd much rather go with writing a customer-specific DTD, if at all possible, just as I'd go for customer-specific stylesheets and other customizations and tweaks. In that way, I could make the structures, utilities, and stylesheets immediately relevant to the customer, thereby saving the customer the time otherwise spent learning a generic structure and then trying to apply it to their needs.

That customer-specific DTD will practically always be smaller than any generic one; I know every single DTD I've ever created has been, including the package of DTDs I wrote for a large automobile manufacturer for all of their aftersales documentation. At the same time the DTD will most likely be far more relevant, far better fine-tuned, for the customer's needs.

And yet, it would be just as easily customizable as DITA or some other "standards-based DTD".

When I'm lecturing on XML and document structure management, I always stress that we use XML because we like to convert XML to other formats, not because we want it to remain the same. If some other company needs DITA documents from us, fine! I doubt it, but if the need arises, it's easy, even trivial, to convert a customer-specific structure to a generic one.
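
To give a sense of how small such a conversion can be, here is a minimal sketch. The customer markup (service-procedure, heading, operations, operation) is entirely hypothetical, made up for this example; the target is a bare-bones DITA task using only task, title, taskbody, steps, step and cmd. In practice this would more likely be an XSLT stylesheet; Python is used here only to keep the sketch self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical customer-specific markup, invented for this illustration.
CUSTOMER_XML = """\
<service-procedure id="sp-001">
  <heading>Replace the air filter</heading>
  <operations>
    <operation>Open the filter housing.</operation>
    <operation>Insert the new filter.</operation>
  </operations>
</service-procedure>"""


def convert(customer_xml):
    """Convert the hypothetical service-procedure structure to a DITA task."""
    src = ET.fromstring(customer_xml)

    # The customer root maps to a DITA task; the id is carried over as-is.
    task = ET.Element("task", {"id": src.get("id")})
    ET.SubElement(task, "title").text = src.findtext("heading")

    # DITA wraps the step list in taskbody/steps.
    steps = ET.SubElement(ET.SubElement(task, "taskbody"), "steps")
    for operation in src.iterfind("operations/operation"):
        # Each customer operation becomes a step holding a single cmd.
        step = ET.SubElement(steps, "step")
        ET.SubElement(step, "cmd").text = operation.text

    return ET.tostring(task, encoding="unicode")


if __name__ == "__main__":
    print(convert(CUSTOMER_XML))
```

A real customer structure would of course have more elements and more edge cases, but the shape of the job stays the same: map specific names onto generic ones and add whatever wrappers the generic structure demands.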

See, DITA to me is just another DocBook. It's a standard, true, but it's just another standard among a thousand other standards. It's open, also true, but so are a thousand others. And of course it claims to be easily customizable, but that's obviously the case with those thousand other standards, too.

But it's also big and generic and not very relevant as such to any specific requirement, not without an abstraction level or two.

3 comments:

Anonymous said...

DITA is a collection of doctypes, rather than one large generic one, and the collection of doctypes is extensible - you can create new XML elements that inherit processing from existing supported ones, which is substantially faster than creating custom doctypes.

If you want something small, the core DITA topic type is about 100 elements (basic tables, lists, metadata etc). If you want something specific, you can create your own doctype, and have working output based on the existing processes in less than a day.

What you're seeing in Oxygen is support for the collection of doctypes provided as a starting point by OASIS. They represent a range of possibilities, from generic topic type, to specific task type, to comprehensive ditabase (all elements in one doctype).

The key points for DITA are:
- topic orientation (which it obviously did not invent, but does provide architectural support for);
- map-based processing (managing related links, topic reuse, and heading level/chapter organization outside of content);
- and specialization (letting you create new elements in DITA without creating custom support every time).

Sum: I hope you'll take a second look at DITA. I suspect your initial assessment has missed some of its more interesting features.

Michael Priestley
Lead IBM DITA Architect
Editor, OASIS DITA Specification

Ari N said...

Thank you for your comments.

My main point has little to do with that initial up-and-running figure and everything to do with my conviction that a generic solution (DITA, DocBook... take your pick) is rarely as relevant to a specific company or organization as is a structure tailored for them. Any generic solution is a compromise, by definition, and therefore less suited to any specific needs, by definition.

Extending such a structure to allow for customization will necessarily add to the overall bloat, but the results may still not be as relevant as one would like. And yes, it is possible to be up and running within a day, but how well does anyone know the structures by then? To me, a core consisting of around 100 elements is not that small.

DITA's great advantage, as I see it, is that the basic semantics are easy to understand and reasonably well defined (task/concept/reference, right?), but that's not the same as an easy-to-understand, ready-for-action authoring environment, relevant to the customer's needs.

I feel the urge for another blog entry, clarifying this.

Anonymous said...

To clarify, specialization doesn't necessarily (or even usually) result in bloat. For example, I could use specialization to create a new topic type called "message_reference" which has six elements:

<message_reference id="A123">
  <msgtext>Too close</msgtext>
  <msgbody>
    <msginfo>You went too close, and got into trouble.</msginfo>
    <msgaction>Move away and you'll be fine.</msgaction>
  </msgbody>
</message_reference>

I'm not suggesting this as a real example, but to give you a sense of what specialization can do. In other words, it can do most of the things you would want to do through customization, except that you don't need to create output support for PDF, HTML, reuse of variables, conditionalization for different platforms or audiences, cross-linking and navigation etc.
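
As a rough illustration of how that inherited processing can work (a sketch only, not the code of any actual DITA processor): every DITA element carries a class attribute that lists its ancestry from most general to most specific, and a processor can fall back to the nearest ancestor it already supports. The class values and the SUPPORTED set below are assumptions made up for this example; in real DITA the class values are defaulted by the specialization's DTD or schema.

```python
import xml.etree.ElementTree as ET

# The class values are hypothetical; they are written out here only so the
# sketch runs without a DTD.
SOURCE = """\
<message_reference id="A123"
    class="- topic/topic reference/reference message_reference/message_reference ">
  <msgtext class="- topic/title message_reference/msgtext ">Too close</msgtext>
  <msgbody class="- topic/body reference/refbody message_reference/msgbody ">
    <msginfo class="- topic/section message_reference/msginfo ">You went too close.</msginfo>
    <msgaction class="- topic/section message_reference/msgaction ">Move away.</msgaction>
  </msgbody>
</message_reference>"""

# Types this imaginary processor already knows how to render.
SUPPORTED = {"topic/topic", "topic/title", "topic/body", "topic/section"}


def nearest_supported(elem):
    """Return the most specific ancestor type the processor supports."""
    # Drop the leading "-" or "+" marker, keep the module/element tokens.
    tokens = elem.get("class", "").split()[1:]
    # Tokens run from general to specific, so search from the end.
    for token in reversed(tokens):
        if token in SUPPORTED:
            return token
    return None


def fallback_table(xml_text):
    """Map each element name in the document to the type it falls back to."""
    root = ET.fromstring(xml_text)
    return {elem.tag: nearest_supported(elem) for elem in root.iter()}


if __name__ == "__main__":
    for name, base in fallback_table(SOURCE).items():
        print(name, "->", base)
```

Here none of the new element names is known to the processor, yet each one resolves to a type that already has output support, which is the point of specialization.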

If you have no need to create output, share content, or single-source for multiple products or audiences, then specialization doesn't buy you much over customization. But it would be a mistake to think that DITA is limited to the OASIS standard types. The process of specialization is a major part of the standard.

Michael Priestley