Need to Know

Knowledge of DocBook is like a security clearance: the user is on a need-to-know basis. That is, to start working with DocBook in a properly configured environment, a user needs to know very little, but there is always something more out there to learn. This section addresses a few details of DocBook that the typical user needs to know to get a first DocBook document up and running. Further details are left to the reader to fill in from other resources (see the section called “Resources”).

What makes an XML document a DocBook document? It is not difficult to write a well-formed XML document, that is, one whose tags are properly opened, nested, and closed. The following example is well-formed XML:

  
  <?xml version="1.0" encoding="ISO-8859-1"?>
  <book>
    <title>How CLP Won the West</title>
    <chapter>
      <title>In the Beginning</title>
      <para>
      There once was a large LP...
      </para>
    </chapter>
  </book>

This document is not much use, though, without some meaning for the tags in it. The DocBook DTD is what gives a document meaning: it defines which elements exist and how they may be nested. The following example works better, and constitutes a valid DocBook document:


  <?xml version="1.0" encoding="ISO-8859-1"?>
  <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
                  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
  <book>
    <title>How CLP Won the West</title>
    <chapter>
      <title>In the Beginning</title>
      <para>
      There once was a large LP...
      </para>
    </chapter>
  </book>

The only difference is the document type declaration, which states that the document is meant to adhere to the standard described in the file http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd (see the section called “Resources” for where to read more about document type declarations and the DocBook DTD). In other words, adding the declaration makes this little example a genuine DocBook document. In this case, the declaration uses an Internet address. However, in a properly configured environment, a network connection would not be necessary to work with the document, thanks to the "catalog" mechanism. A discussion of catalogs is beyond the scope of this document (see the section called “Resources” for more). Should catalogs not be properly configured on a given system, one could instead use a local path to the DTD in the document type declaration (e.g. /usr/share/docbook-xml42/docbookx.dtd).
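
As a rough sketch of that fallback, the declaration with a local path might look like the following. Only the system identifier (the second quoted string) changes. The path shown is the example path mentioned above and is an assumption about where the DTD is installed; it will differ from system to system.

  <!-- Assumes the DTD is installed at this path; adjust for the local system. -->
  <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
                  "/usr/share/docbook-xml42/docbookx.dtd">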

Suppose the complete DocBook document shown above is saved in a file named bookex.xml. Creating a single HTML document from this file takes just one command:

$ xmlto html-nochunks bookex.xml

To create a multi-part HTML version is just as easy:

$ xmlto html bookex.xml

A final and very important DocBook topic is that of "entities". For the purposes of writing CLP documentation (i.e. what follows is a gross simplification), entities are a way of "#include-ing" one document into another, and of using certain special characters which would otherwise confuse the tools used to process DocBook documents. The simplest example of the latter is the < symbol, which is used to begin tags in XML. Rather than putting the character directly into the document text, an entity can be used. Specifically, one would use the string &lt; instead. The other use of entities, as suggested above, is to split a document into convenient pieces. This is demonstrated in the section called “DocBook and CLP, Perfect Together”.
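
Both uses of entities can be sketched in a few lines. The example below is only an illustration, not the layout used by the CLP documentation: the file name chapter1.xml and the entity name chapter1 are hypothetical. The entity is declared between the square brackets of the document type declaration, and is then referenced as &chapter1; at the point where the chapter should appear.

  <?xml version="1.0" encoding="ISO-8859-1"?>
  <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
                  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
    <!-- Hypothetical entity: pulls in the contents of chapter1.xml -->
    <!ENTITY chapter1 SYSTEM "chapter1.xml">
  ]>
  <book>
    <title>How CLP Won the West</title>
    &chapter1;
  </book>

The included file would then hold just the chapter element, with no document type declaration of its own. Its text (invented here for illustration) also shows &lt; standing in for a literal < character:

  <chapter>
    <title>In the Beginning</title>
    <para>
    There once was a large LP with m &lt; n...
    </para>
  </chapter>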