discuss: Thread: progress update: 1. automation of source to output publication

Subject: progress update: 1. automation of source to output publication
From: "Martin A. Brown" ####@####.####
Date: 4 Mar 2016 20:39:35 +0000
Message-Id: <alpine.LSU.2.11.1603041237550.19013@znpeba.jbaqresebt.arg>

Hello all,

Here's an update on the progress I have made in creating a wrapper 
to publish all LDP documents, in some sort of automatable way.


Automation software
-------------------
I started by writing something in shell, but it quickly got rather 
unwieldy and tangled.  Therefore, I switched to Python, even though 
many of the core features are handled by calling out to other 
programs (like 'sgml2html', 'xsltproc', and 'html2text').

I have a minimally feature-complete package called 'python-tldp' 
[0], which provides a utility 'ldptool' for validating (where 
relevant), processing, publishing and managing our documents. 
There's a README [1] and still a bit more TODO [2].

It is important that the 'ldptool' be flexible enough to support new 
document formats.  See below for further discussion of source 
doctype formats.

I would like to create a repository at github.com/tLDP which will 
contain this TLDP software (I chose the MIT license for the software).  
I would designate the TLDP code repository as the canonical location 
for the source to this software (since it is our tool).  I propose 
the following:

  https://github.com/tLDP/python-tldp

I'm open to different names, if 'python-tldp' seems too broad.


Source Formats (technical support)
----------------------------------
We have discussed supporting plain text (with some minimum 
requirements for title, author, date, etc.).  We can support "plain 
text", although we will spell that "AsciiDoc".  See rationale:

I tried the AsciiDoc tools against plain text, and they just work. 
Thus, if we get a plain text submission, we can add the minimum 
markup required to turn the document into a valid (structured) 
AsciiDoc format.  This lowers the bar for markup/document authoring 
and separates the important content question from the 
language/format question.  (We should be able to spend our time 
focusing on the high quality content, not the opening/closing tags.)
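
As a rough illustration of "adding the minimum markup", here is a 
small Python sketch that prepends an AsciiDoc document header (title, 
author line, revision line) to a plain text body.  This is not code 
from 'ldptool'; the function name and its parameters are hypothetical, 
and a real submission would likely need its section titles marked up 
as well.

```python
def add_asciidoc_header(body, title, author, email, revision, date):
    """Prepend a minimal AsciiDoc header to a plain text body.

    Hypothetical helper for illustration only; not part of 'ldptool'.
    The header is the document title line, the author line, and the
    revision line, followed by one blank line before the body.
    """
    header = [
        f"= {title}",              # document title
        f"{author} <{email}>",     # author line
        f"v{revision}, {date}",    # revision number and date
        "",                        # blank line ends the header
    ]
    return "\n".join(header) + "\n" + body.lstrip("\n")


if __name__ == "__main__":
    text = "This document describes an example.\n"
    print(add_asciidoc_header(text, "Example HOWTO", "Jane Doe",
                              "jane@example.org", "1.0", "2016-03-04"))
```

The remaining text is passed through unchanged, so the author's 
content stays exactly as submitted.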

In my travels over the last few weeks, through the wilds of 
Doculandia, I examined the toolchain for AsciiDoc (asciidoc and a2x, 
specifically), and I have come to the conclusion that we can easily 
support AsciiDoc, as well.

I hereby commit to extending 'ldptool' to support AsciiDoc, in the 
same way that I added support for DocBook 5.0.  See following...

There is no longer any technical hurdle in our choosing to support 
DocBook 5.x [3], if we desire to do so (and I think we should).

I asked about DocBook 5.x a few weeks back and Leo Noordergraaf 
mentioned that he had written the Assembly-HOWTO.xml in DocBook 5.x.  
I used his document as an example and I can process that document 
and generate the desired outputs (PDF, text, HTML chunked and 
single).

Therefore, I added DocBook 5.0 support in 'ldptool'.  This support 
is now there and working against Assembly-HOWTO.xml [4].
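
For anyone curious how a tool might tell a DocBook 5.x source apart 
from a DocBook XML 4.x source, one possible check is the namespace on 
the root element (see [4]).  The sketch below is only an assumption 
about how such a check could look, not necessarily what 'ldptool' 
does; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# The DocBook 5.x namespace, declared on the root element.
DOCBOOK5_NS = "http://docbook.org/ns/docbook"


def is_docbook5(xml_text):
    """Return True if the root element is in the DocBook 5 namespace.

    Hypothetical helper for illustration; it parses the whole document
    and only inspects the namespace of the root element.
    """
    root = ET.fromstring(xml_text)
    # ElementTree spells a namespaced tag as "{namespace}localname".
    return root.tag.startswith("{" + DOCBOOK5_NS + "}")


if __name__ == "__main__":
    v5 = ('<article xmlns="http://docbook.org/ns/docbook" version="5.0">'
          '<title>Example</title></article>')
    v4 = '<article><title>Example</title></article>'
    print(is_docbook5(v5), is_docbook5(v4))
```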


Proposed revision to supported document formats
-----------------------------------------------
I propose that we revise the accepted document formats as follows:

  * Linuxdoc
  * AsciiDoc
  * text
  * DocBook XML 4.x
  * DocBook 5.x
  * DocBook SGML 4.x
  * DocBook SGML 3.x (no longer accepted, though supported)

Again, all of this is in the service of encouraging high quality 
documentation.  It matters so little what the chosen markup language 
or text format is.  If the content is good and the license is 
compatible, let's take it.

I welcome any feedback,

-Martin

 [0] https://github.com/martin-a-brown/python-tldp

 [1] https://github.com/martin-a-brown/python-tldp/blob/master/README.rst

 [2] https://github.com/martin-a-brown/python-tldp/blob/master/TODO

 [3] There could be a bit of complexity if authors add other XML
     languages to their documents (e.g. SVG, MathML).  If our
     authors are only using XML elements in DocBook 5.0 namespace
     (and things like xlink), we should be good.  There will, no
     doubt, be a bit more learning here.

 [4] The doctype declaration in Leo Noordergraaf's committed
     Assembly-HOWTO.xml called itself DocBook 4.5, but after
     changing that, adding the namespace declaration on the root
     element and a few other changes, it validates and can be used
     by the xsltproc toolchain to produce the desired outputs.

-- 
Martin A. Brown
http://linux-ip.net/
Subject: Re: progress update: 1. automation of source to output publication
From: David Lawyer ####@####.####
Date: 5 Mar 2016 23:47:40 +0000
Message-Id: <20160305234837.GD28454@daveslinux>

On Fri, Mar 04, 2016 at 12:40:39PM -0800, Martin A. Brown wrote:
> 
> Hello all,
> 
> Here's an update on the progress I have made in creating a wrapper 
> to publish all LDP documents, in some sort of automatable way.
> 
> 
> Automation software
> -------------------
> I started by writing something in shell, but it quickly got rather 
> unwieldy and tangled.  Therefore, I switched to Python, even though 
> many of the core features are called out in other programs (like 
> 'sgml2html', 'xsltproc', and 'html2text').

I'm a little concerned that there might be some duplication of effort here.
The Lampadas project for LDP was to use Plone.  If we were to use Plone,
then it also provides for publishing LDP docs and in addition has
metadata.  To find out more about Lampadas (ceased development in 2004 due
to illness of the author, David Merrill) see:
https://github.com/tLDP/lampadas/tree/master/Lampadas and/or google
lampadas.  When looking over the lampadas folder, bear in mind that it
likely contains python code for the non-Plone version of lampadas, which
was rejected for incomplete object persistence (I only have a vague idea
of what this term means).  The metadata part per D. Merrill was created
for LDP by ibiblio (based on DublinCore) and is known as Open Metadata
Framework = OMF (not to be confused with the OMF video game).  This OMF is
used by Gnome, etc.

> 
> I have a minimally feature-complete package called 'python-tldp' [0],
> which provides a utility 'ldptool' for validating (where relevant),
> processing, publishing and managing our documents.  There's a README [1]
> and still a bit more TODO [2].
> 
> It is important that the 'ldptool' be flexible enough to support new
> document formats.  See below for further discussion of source doctype
> formats.
> 
> I would like to create a repository at github.com/tLDP which will
> contain this TLDP software (I chose the MIT license for the software).  I
> would designate the TLDP code repository as the canonical location for
> the source to this software (since it is our tool).  I propose the
> following:
> 
>   https://github.com/tLDP/python-tldp
> 
> I'm open to different names, if 'python-tldp' seems too broad.
> 
> 
> Source Formats (technical support)
> ----------------------------------
> We have discussed supporting plain text (with some minimum
> requirements for title, author, date, etc.).  We can support "plain
> text", although we will spell that "AsciiDoc".  See rationale:
> 
> I tried the AsciiDoc tools against plain text, and they just work.
> Thus, if we get a plain text submission, we can add the minimum markup
> required to turn the document into a valid (structured) AsciiDoc format.
> This lowers the bar for markup/document authoring and separates the
> important content question from the language/format question.  (We
> should be able to spend our time focusing on the high quality content,
> not the opening/closing tags.)

I was going to suggest the same for linuxdoc.  The markup to be
submitted would be:
------------------------------------------------------------------------
<title> Serial HOWTO
<author>David S. Lawyer, ####@####.#### 
<date> v2.28 July 2015 

<sect>Introduction
The serial port, while obsolete on ...

<sect> ...
Use blank lines to separate paragraphs.  Perhaps garble the email to help
avoid getting spam.
____________________________________________________________________
Then a script is used by LDP to create a valid linuxdoc file by enclosing
the submission in <article> </article> tags and then adding the LinuxDoc
DTD statement <!doctype linuxdoc system> on line one.  Also a <p>
(paragraph) tag must be added by the script to the start of the first
line of each section.  One can make a vim script to do this by editing an
actual file in vim (an editor) and recording the edit for reuse as a
script (a feature of vim).  A sed script would perhaps be a fraction of a
second faster.  The standard tools for processing LinuxDoc add closing
tags to all the other tags shown above, and the author is seldom aware of
this "internal" filling-in of omitted tags.
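
Since the publishing tool is in Python, the same wrapping could also be
sketched there instead of vim or sed.  The following is only a rough
illustration of the steps just described; the function name is made up,
it treats <sect>, <sect1>, etc. alike, and the result has not been checked
against the LinuxDoc DTD.

```python
import re

# Matches a section heading line such as "<sect>Introduction" or "<sect1>..."
SECT_RE = re.compile(r"^\s*<sect\d?>", re.IGNORECASE)


def wrap_linuxdoc(submission):
    """Wrap a minimal submission so it looks like a full LinuxDoc file.

    Hypothetical helper for illustration only:
      * puts the doctype statement on line one,
      * encloses the submission in <article> ... </article>,
      * prefixes <p> to the first non-blank line after each section heading.
    """
    out = ["<!doctype linuxdoc system>", "<article>"]
    need_p = False  # True while waiting for the first body line of a section
    for line in submission.splitlines():
        if SECT_RE.match(line):
            need_p = True
            out.append(line)
        elif need_p and line.strip():
            out.append("<p>" + line)
            need_p = False
        else:
            out.append(line)
    out.append("</article>")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = ("<title> Serial HOWTO\n"
              "<author>An Author\n"
              "<date> v2.28 July 2015\n"
              "\n"
              "<sect>Introduction\n"
              "The serial port, while obsolete on ...\n")
    print(wrap_linuxdoc(sample), end="")
```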

> In my travels over the last few weeks, through the wilds of Doculandia,
> I examined the toolchain for AsciiDoc (asciidoc and a2x, specifically),
> and I have come to the conclusion that we can easily support AsciiDoc,
> as well.
> 
I once looked at AsciiDoc and concluded that LinuxDoc was just as easy to
learn/use.  The AsciiDoc manual is huge compared to LinuxDoc but AsciiDoc
has better support for images and supports utf-8.  But wiki markup may be
easier than either AsciiDoc or LinuxDoc.  

Am I correct that there are barriers to validating a wiki markup off-line?
Does one need to install a wiki system on their home PC and then use it
for writing wiki docs offline?

> I hereby commit to extending 'ldptool' to support AsciiDoc, in the same
> way that I added support for DocBook 5.0.  See following...
> 
> Again, all of this is in the service of encouraging high quality
> documentation.  It matters so little what the chosen markup language or
> text format is.  If the content is good and the license is compatible,
> let's take it.

However, the markup becomes important when a new potential author
considers taking over a doc.  If the markup is not familiar and/or takes
a non-trivial effort to learn (or use), the potential author may not
volunteer to take it over.

> I welcome any feedback,
			David Lawyer