docbook: Updated XSL Stylesheets


Previous by date: 2 Jul 2003 17:50:08 -0000 Re: Updated XSL Stylesheets, Bob Stayton
Next by date: 2 Jul 2003 17:50:08 -0000 Re: Updated XSL Stylesheets, Bob Stayton
Previous in thread: 2 Jul 2003 17:50:08 -0000 Re: Updated XSL Stylesheets, Bob Stayton
Next in thread: 2 Jul 2003 17:50:08 -0000 Re: Updated XSL Stylesheets, Bob Stayton

Subject: Re: Updated XSL Stylesheets
From: Emma Jane Hogbin ####@####.####
Date: 2 Jul 2003 17:50:08 -0000
Message-Id: <20030702175005.GK1372@xtrinsic.com>

Apologies in advance for the length. I've moved the bottom bit up to the
top just in case people don't want to read my HTML suggestions. :)

> I'd like to work with you toward getting the DocBook
> stylesheets in better HTML compliance, when the 
> new parameter is implemented.

Cool. :) Please stay in touch via email or through this list when it's
time.

On Wed, Jul 02, 2003 at 10:09:03AM -0700, Bob Stayton wrote:
> On Tue, Jul 01, 2003 at 05:27:38PM -0400, Emma Jane Hogbin wrote:
> > On Tue, Jul 01, 2003 at 02:15:33PM -0700, Bob Stayton wrote:

> As one of the maintainers of the DocBook XSL stylesheets,
> I can confirm that they still output some older HTML tags like <b>.
> The debate has been about whether or not the stylesheets should
> produce usable output without requiring a CSS stylesheet.
> There is a sufficiently large installed base for the stylesheets
> that requiring CSS would be disruptive for some.

It depends on why <b>, <i> and <tt> are being used. Sometimes it's for
formatting purposes. Here are a couple of examples of how I might change
things:
	- the word Author is wrapped in <i> at the very beginning of 
	  the document. In this case it is for formatting purposes only. I would
	  remove the <i> completely and allow formatting based on CSS only.
	- the <emphasis> DocBook tag is replaced by <i class="emphasis">. This
	  should be replaced by the HTML tag <em>.
	- <th ...><b>Revision History</b></th> By default <th> is centered and
	  bold. Remove the <b> tag which is used for formatting purposes only.
	- <b> is used for either class="userinput" or class="command". In both
	  cases this is for a formatting effect. <b> is for formatting so
	  technically this is correct, although I would probably use <span>
	  instead as <b> is deprecated (and therefore disappears as of XHTML
	  1.0 Strict)
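
To make the <emphasis> suggestion concrete, here is a rough sketch of a
customization layer that emits <em> instead of <i class="emphasis">. The
import URL and the handling of role="bold"/"strong" are my assumptions,
not something confirmed in this thread, and the sketch ignores anything
else the stock template may do (id anchors, for instance).

```xml
<?xml version="1.0"?>
<!-- Hypothetical customization layer: a sketch, not the stock behaviour. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Assumed location of the stock HTML stylesheet; adjust to your install. -->
  <xsl:import
      href="http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"/>

  <!-- Map DocBook emphasis to semantic HTML instead of i/b. -->
  <xsl:template match="emphasis">
    <xsl:choose>
      <!-- Assumption: these two role values mean strong emphasis. -->
      <xsl:when test="@role = 'bold' or @role = 'strong'">
        <strong><xsl:apply-templates/></strong>
      </xsl:when>
      <xsl:otherwise>
        <em><xsl:apply-templates/></em>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Because the template lives in the importing stylesheet it takes precedence
over the imported one, so nothing in the stock files needs to be edited.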

> That said, the stylesheets now have a 'make.valid.html'
> parameter whose effect will be to clean up these remaining
> problems.  But it is only planned and not yet implemented
> in the stylesheets at this moment.  Probably by the
> next release.

The use of <b>, <i> and <tt> is completely acceptable with the correct
DTD! I just wanted to clarify that. It's more how they're being used.
Sometimes I don't feel it's the best choice of HTML markup given the 
content. Plus it's good to move away from elements that are deprecated.

> > 	- the page doesn't validate as HTML or as XHTML (as per the correct
> > 	  directory). The DOCTYPE is missing from the HTML version so
> > 	  validator.w3.org doesn't even bother trying.
> 
> Right.  Coming soon.  But probably 4.01 Transitional, not Strict. 

I'm cool with a validating Transitional document. :) There's also an XHTML
folder for the nwalsh files so you can pick between HTML and XHTML. I
think that's a neat option to have.

> >         For the XHTML version
> > 	  there are namespaces put into elements that don't allow them.
> 
> This is a bug in the version of xsltproc you are using.
> Try a later version.

Bah. Debian did all the installing and that is the latest version
according to unstable. This is the version I'm using:
emmajane@debian:/web/ref$ xsltproc -version
Using libxml 20507, libxslt 10030 and libexslt 720
xsltproc was compiled against libxml 20507, libxslt 10030 and libexslt 720
libxslt 10030 was compiled against libxml 20507
libexslt 720 was compiled against libxml 20507

> > 	- output in firebird-mozilla has a weird character after section
> > 	  numbers and before the text. It's a capital A with a circumflex
> > 	  (hat) accent. Also visible in the plain text output -- perhaps the
> > 	  character encoding meta information is incorrect?

Update: it also translates <quote> into something that lynx can read but
firebird-mozilla (and less/more) can't read.

> > 	- same complaints as before re. new lines, but it's much better this
> > 	  time.
> 
> The HTML stylesheets use <xsl:output indent="no"/>
> for various reasons.  Unfortunately, that indent
> attribute cannot be set by a runtime stylesheet parameter.
> But the ldp customization could change it.
> The custom xsl:output could also produce a DOCTYPE
> declaration, if you like.

Will look into this.
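
For reference, a minimal sketch of what such a customization might look
like. The import URL is an assumption about where the stock stylesheet
lives, and the encoding is a guess; the DOCTYPE identifiers are the
standard ones for HTML 4.01 Transitional.

```xml
<?xml version="1.0"?>
<!-- Hypothetical customization layer: a sketch under the assumptions above. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Assumed location of the stock HTML stylesheet; adjust to your install. -->
  <xsl:import
      href="http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"/>

  <!-- Override the imported xsl:output: turn indenting on and
       emit an HTML 4.01 Transitional DOCTYPE declaration. -->
  <xsl:output method="html"
              encoding="ISO-8859-1"
              indent="yes"
              doctype-public="-//W3C//DTD HTML 4.01 Transitional//EN"
              doctype-system="http://www.w3.org/TR/html4/loose.dtd"/>

</xsl:stylesheet>
```

The xsl:output in the importing stylesheet has higher import precedence,
so its attributes win over the stock indent="no" setting.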

> > 	- there is still some HTML which could be stripped out, I think. For
> > 	  example: <div class="titlepage"><div><div><h3 class="title">
> > 	  <a id="id2800871"></a>8.1.? Unpack</h3></div></div><div>
> > 	  What are the extra <div>s for?
> 
> The DocBook stylesheets have a pretty complex system for
> generating headings, using a general "title page" system
> that provides a lot of optional control.  It leads to nested
> structures, some of which may not appear in your output.

*nod* I thought it would be something like that. Sometimes this is the way
things are with automated procedures.

> > 	- <div class="blockquote"><table border="0" width="100%"
> > 	  cellspacing="0" cellpadding="0" class="blockquote" summary="Block
> > 	  quote">
> > 	  Argh!! Why not just use the HTML element "blockquote"?
> 
> The HTML table is used to format the placement of the
> attribution child of the DocBook blockquote element.
> If you don't use an attribution, you should get the HTML
> <blockquote> element.

This would be a *perfect* candidate for something you could do with CSS.
<blockquote>
<div style="text-align: right">attribution</div>
</blockquote>

Just checked the XHTML 1.0 Transitional DTD and this is legal. I'd need to
check the HTML 4 one as well but I'm sure it would be fine. Ultimately the
style attribute would be replaced by class="attribution" and the
attribution class would be styled however you wanted.
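
On the stylesheet side, the idea could be sketched roughly as below. This
is an illustration only: the import URL is assumed, the template replaces
whatever the stock blockquote template does (including any title
handling), and the class name "attribution" is just the one suggested
above.

```xml
<?xml version="1.0"?>
<!-- Hypothetical customization layer: a sketch, not the stock behaviour. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Assumed location of the stock HTML stylesheet; adjust to your install. -->
  <xsl:import
      href="http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"/>

  <!-- Emit a real blockquote element; no layout table. -->
  <xsl:template match="blockquote">
    <blockquote>
      <!-- Quoted content first, skipping the attribution child. -->
      <xsl:apply-templates select="*[not(self::attribution)]"/>
      <!-- Attribution, if present, goes in a div that CSS can position,
           e.g. .attribution { text-align: right; } -->
      <xsl:if test="attribution">
        <div class="attribution">
          <xsl:apply-templates select="attribution/node()"/>
        </div>
      </xsl:if>
    </blockquote>
  </xsl:template>

</xsl:stylesheet>
```

Without a CSS stylesheet the attribution simply appears left-aligned
under the quote, so the output stays usable either way.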

> > 	- <div class="sect1" lang="en" xml:lang="en">
> > 	  are sections sometimes in a different language than the parent
> > 	  document? I don't think this is necessary as my guess is that
> > 	  documents are always written in a single language. Why not simply
> > 	  put the language in the <html> start tag and be done with it?
> 
> As far as I can tell, there are only two places where the
> DocBook XSL stylesheets output a 'lang' attribute, for a
> blockquote and foreignphrase if it carries a lang or
> xml:lang attribute.  They don't output lang on the
> root element, and they don't output xml:lang at all. 
> I don't know where that's coming from.

Hmm, I will need to look into this. As a side note: for accessibility purposes 
all documents should have their natural language identified in the <html>
(root) element. http://www.w3.org/TR/WCAG10/#gl-abbreviated-and-foreign
This is to help screen reading software identify which language they are 
reading. I think it would be neat if the documents validated not only as
HTML but also validated under the Web Accessibility Initiative (WAI)
guidelines. These guidelines are mandatory for many government documents
and are the base for the American Section 508 Guidelines on accessibility.

As a second side note: I think there should be a little bit of attention
put into the meta data that is output. (I'm notoriously lazy on this point 
when authoring my own documents.) It would be easy to add at least an
author. 

emma

-- 
Emma Jane Hogbin
[[ 416 417 2868 ][ www.xtrinsic.com ]]



  ©The Linux Documentation Project, 2014. Listserver maintained by dr Serge Victor on ibiblio.org servers. See current spam statz.