A DocBook Toolchain

I should outline here the basics of the toolchain I had evaluated. It was based on DocBook. A two sentence introduction to DocBook can be found on the front page of the SourceForge DocBook Project. I reproduce it here in full:

DocBook is an XML vocabulary that lets you create documents in a presentation-neutral form that captures the logical structure of your content. Using free tools along with the DocBook XSL stylesheets, you can publish your content as HTML pages and PDF files, and in many other formats.

I would also like to highlight a couple of points from the preface to Bob Stayton's DocBook XSL: The Complete Guide—a reference which anyone actually using DocBook is sure to have bookmarked:

A major advantage of DocBook is the availability of DocBook tools from many sources, not just from a single vendor of a proprietary file format. You can mix and match components for editing, typesetting, version control, and HTML conversion.
The other major advantage of DocBook is the set of free stylesheets that are available for it... These stylesheets enable anyone to publish their DocBook content in print and HTML. An active community of users and contributors keeps up the development of the stylesheets and answers questions.

So, the master document is written in XML conforming to the DocBook DTD. This master provides the structure and content of our document. Transforming the master into different output formats starts with the DocBook XSL stylesheets. Various aspects of the transformation can be controlled by setting parameters to be applied during this transformation (do we want a datestamp to appear in the page footer?, should a list of Figures be included in the contents?), or even by writing custom XSL templates (for the front page, perhaps).

Depending on the exact output format there may still be work for us to do. For HTML pages, the XSL transformation produces nicely structured HTML, but we probably want to adjust the CSS style sheet and perhaps provide our own admonition and navigation graphics. For Windows HTML Help, the DocBook XSL stylesheets again produce a special form of HTML which we must then run through an HTML Help compiler.

PDF output is rather more fiddly: The DocBook XSL stylesheets yield XSL formatting objects (FO) from the DocBook XML master. A further stage of processing is then required to convert these formatting objects into PDF. I used the Apache Formatting Objects Processor (FOP), which in turn depends on other third-party libraries for image processing and so on.

As we can see, there are choices to be made at all stages: which XSL transform software do we use, which imaging libraries; do we go for a stable release of Apache FOP or the development rewrite? Do we spend money on a DocBook XML editor? Since we have full source access for everything in the chain we might also choose to customise the tools if they aren't working for us.

These choices were, to start with, a distraction. I was happy to go with the selection made by the Hibernate team unless there was a good reason not to. I wanted the most direct route to generating decent documentation. I kept reminding myself that content was more important than style (even though the styling tools were more fun to play with).

Copyright © 2006 Thomas Guest