Subject: automation cleanup of source LDP tree (informational)
From: "Martin A. Brown" ####@####.####
Date: 27 Jan 2016 01:11:23 +0000
Message-Id: <alpine.LSU.2.11.1601261656560.2025@znpeba.jbaqresebt.arg>

Hello,

> 1. automation: Be able to (re-)process and (re-)publish all of
>    our existing documentation in an automated fashion.

This is a description of work I have already accomplished and
committed to my own git repository.

Automation cleanup (source):
----------------------------
Many of the documents at git HEAD [0] in our main LDP/howto tree
sport validation errors when processed with toolchains running on
modern Linux releases (i.e., OpenSUSE-13.2 and Ubuntu-14.04.3).

I have a (local) git repository with hundreds of corrections to
source files in all formats (Linuxdoc, DocBook SGML and DocBook
XML). I would characterize these corrections as non-editorial--i.e.
they are technical only, to allow each document to validate and to
allow the processor to generate outputs.

The only substantive change I have made in the cleanups is to move
any <graphic/>, <mediaobject/>, or <inlinemediaobject/> images into
an ./images/ directory, which is copied to the HTML (output) tree.
Otherwise, images are not visible in the output. Not desirable.

My cleanup changes (about 200 commits) are at:

  https://github.com/martin-a-brown/LDP

Since I doubt anybody wants to read through the entire git log,
here's a shorter description of the various classes of changes that
I have made to the individual documents:

  * adding countless closing tags, such as </sect1>, </sect2>,
    </sect3>, </listitem>, </para>, </varlistentry>

  * switching to entities for reserved characters, e.g.
    & to &amp;, <> to &lt;&gt;, [] to &lsqb;&rsqb;, etc.
    (particularly where people had left email addresses in angle
    brackets)

  * renaming files containing XML from stem.sgml to stem.xml

  * character set encoding; using entities in ASCII, converting to
    Unicode with Byte Order Mark (BOM) where possible

  * corrections to many DOCTYPE definitions

  * "upgrading" DocBook versions when authors used elements or
    features from a newer DocBook standard (e.g. 3

  * substituting dash for underscore in the id attribute
    ([open]jade refuses _ in id=)

  * committing converted image files (e.g. eps) to the repo for
    documents (processors do not generate them on the fly; did
    they use to?)

  * adding XML/SGML comment closures -->, where accidentally
    omitted; removing stray '--' which was confusing SGML/XML
    processors

  * wrapping large blocks of <programlisting/> code with
    <![CDATA[]]>

  * replacing non-DocBook XML elements with DocBook equivalents,
    i.e. <xlink:href/> becomes <ulink/>; replacing HTML elements
    <a href=""> with <url url=""> in Linuxdoc documents

  * removing extra (and sometimes empty) tags which confused the
    processor

  * and, probably many other small errors that jade or xsltproc
    complained about...

I will observe that the vast majority of these corrections were on
DocBook (both SGML and XML) files. Several Linuxdoc files required
adding missing tags, correcting a few tag names and even a few
entity corrections, as well.

I guess that earlier SGML processors (or their operating
configurations) were more forgiving of many of these errors.

This message treats only the cleanup needed in the source tree.
There is separate work for the cleanup of the output tree, lots of
old documents that maybe should be archived, etc.

-Martin

 [0] https://github.com/martin-a-brown/LDP

-- 
Martin A. Brown
http://linux-ip.net/
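
Two of the mechanical fixes listed in the message (dashes instead of
underscores in id attributes, and entities for bare ampersands) can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions, not the method used for the commits described above:
the function names dash_ids and escape_bare_ampersands are hypothetical,
the regular expressions ignore comments and CDATA sections, and any
linkend attribute pointing at a renamed id would need the same change
by hand.

```python
import re

# Matches a whole id="..." or id='...' attribute. The lookbehind keeps
# attributes that merely end in "id" (a hypothetical "valid=") untouched.
ID_ATTR = re.compile(r"""(?<![\w:-])(id\s*=\s*)(["'])(.*?)\2""", re.IGNORECASE)

# Matches an ampersand that does not already start an entity or a
# character reference such as &amp; or &#38;. Assumption: entity names
# consist of word characters only.
BARE_AMP = re.compile(r"&(?!#?\w+;)")


def dash_ids(text: str) -> str:
    """Replace underscores with dashes inside id attribute values."""
    def fix(match: re.Match) -> str:
        prefix, quote, value = match.group(1), match.group(2), match.group(3)
        return prefix + quote + value.replace("_", "-") + quote
    return ID_ATTR.sub(fix, text)


def escape_bare_ampersands(text: str) -> str:
    """Turn bare '&' into '&amp;' and leave existing entities alone."""
    return BARE_AMP.sub("&amp;", text)


if __name__ == "__main__":
    print(dash_ids('<sect1 id="my_first_section">'))
    print(escape_bare_ampersands("AT&T &amp; R&D"))
```

A cautious way to use such a script would be to run it on one document,
re-validate with the same toolchain (jade or xsltproc), and review the
diff before committing.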