docbook: XML v. SGML entities


Previous by date: 13 Nov 2002 06:03:57 -0000 Re: XML v. SGML entities, Greg Ferguson
Next by date: 13 Nov 2002 06:03:57 -0000 Re: XML v. SGML entities, Greg Ferguson
Previous in thread: 13 Nov 2002 06:03:57 -0000 Re: XML v. SGML entities, Greg Ferguson
Next in thread: 13 Nov 2002 06:03:57 -0000 Re: XML v. SGML entities, Greg Ferguson

Subject: RE: XML v. SGML entities
From: "Jim Weller" ####@####.####
Date: 13 Nov 2002 06:03:57 -0000
Message-Id: <000401c28ada$4d999690$d400a8c0@synergy>

Gang,
Good lord this is reverse engineering of the worst kind. I feel
enlightened, much like Socrates' people chained to the cave wall
watching shadows of reality dancing must have felt when they saw the
real world. And I wrote a bloody howto on using these tools!

I have got three collections of entity files.

ISO* (ISO)
*.gml (gml)
*.ent (ent)

The parens tell how I will reference them.

Each collection has 21 files, and the Latin-1 file defines 62 entities.
I'm using the Latin-1 file as my basis for comparison. All three
collections describe the same thing: lists of special
characters/entities. The ISO and gml files are identical line for line.

That leaves us with two sets: the ents and the ISO/gml files (which
I'll call the gmls).

The ISO/gml files are full of SDATA definitions of the form
"<!ENTITY aacute SDATA "[aacute]"--=small a, acute accent-->". The gml
group is the one we all spend lots of time hand-crafting catalogs and
file names for. These are the SGML SDATA entities that Norm references
in the URL I posted earlier. That's the data type expected by the
command that I've traditionally used to compile docbook xml (e.g.
openjade -t xml -d ldp.dsl#html xml.dcl howto.xml).

But the ent files contain hex character reference definitions of the
form "<!ENTITY aacute "&#x00E1;"> <!-- LATIN SMALL LETTER A WITH
ACUTE -->". These are the normal files that are referenced in and
distributed with the docbook.cat catalog file.
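
To check which flavor a given entity file contains, the two declaration
styles quoted above can be told apart mechanically. The following is
only a rough sketch: the function name classify_entities and the
regular expression are illustrative assumptions, not part of any
DocBook or openjade tooling, and the pattern only covers declarations
shaped like the two examples above.

```python
import re

# Matches general entity declarations shaped like the two examples:
#   <!ENTITY aacute SDATA "[aacute]"--=small a, acute accent-->
#   <!ENTITY aacute "&#x00E1;">
# Parameter entities (<!ENTITY % ...>) are deliberately not matched.
ENTITY_DECL = re.compile(
    r'<!ENTITY\s+(?P<name>[\w.-]+)\s+(?P<sdata>SDATA\s+)?"(?P<value>[^"]*)"'
)

HEX_CHARREF = re.compile(r"&#x[0-9A-Fa-f]+;")


def classify_entities(text):
    """Map each entity name in text to "sdata", "charref", or "other"."""
    kinds = {}
    for match in ENTITY_DECL.finditer(text):
        name = match.group("name")
        if match.group("sdata"):
            kinds[name] = "sdata"      # SGML-style SDATA definition
        elif HEX_CHARREF.fullmatch(match.group("value")):
            kinds[name] = "charref"    # XML-style hex character reference
        else:
            kinds[name] = "other"      # anything this sketch does not cover
    return kinds
```

Running it over the text of one file from each collection should show
whether that file is SDATA or hex throughout.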

So, to clarify: the docbook distribution's catalog file points to a
series of *.ent files in the ent directory to define glyphs/entities.
Those entities are by default in &#xHEX format. Openjade expects
that hex to be a function name (SDATA), so openjade freaks and prints
all those errors. Does jade do this?

I'd like to see this "fixed" such that either openjade handles the
&#xHEX references correctly or docbook catalog points to SDATA entity
definitions.

The obvious workarounds are:
1. Write a custom catalog that points to the correct entity files. See
Greg's docbook distro mentioned in his reply. (Thanks, Greg!)
2. Replace the *.ent files with the *.gml files, renaming each one from
.gml to .ent, because .ent is the file extension that the docbook.cat
from the docbook distro expects. I tested this and it works.

I've also recently tried using the xsl stylesheets with xsltproc. It
definitely gave me output, but in one (ugly) html file. I think the
dsssl stylesheet transformation was more elegant and navigable. I've of
course heard that there is a chunking xsl stylesheet out there
somewhere. I'll look into that.

I think I've got it down to openjade, docbook-dsssl, ldp-dsssl,
docbook-dtd, and the SDATA entities (renamed or cataloged). That even
feels excessive. I should just be able to use the entities that come
with the dtd. I'm really just interested in the simplest light-weight
processing environment possible. I mean as an author I just care about
my markup being correct and complete. I'm assuming that there are a
bug-a-jillion people out there that can process my original content
their particular way.

Thoughts?
Jim

-----Original Message-----
From: Greg Ferguson ####@####.#### 
Sent: Tuesday, November 12, 2002 8:03 AM
To: Jim Weller; ####@####.####
Subject: Re: XML v. SGML entities


On Nov 12,  7:58am, Jim Weller wrote:
> Subject: RE: XML v. SGML entities
> Greg,
> As always thanks for the prompt and pertinent reply.
>
> I grabbed your distro and took a look, but I'm also trying to 
> understand the workings here. I notice that you put the ISO* entity 
> files in the same folder as the DTD and the *.gml files in the ents/ 
> sub folder. Where are those referenced by catalogs? I notice that 
> dbcentx.mod has matching "ISO*.module" type lines does that cover the 
> ISO* files? But where are the *.gml files mentioned?

They aren't mentioned. This was discovered while I was trying to find a
solution for the "is not a function name" XML entity problem.

Also, some of the entity-related files in the DTD directory (and perhaps
in the ents directory) might be "cruft". I never bothered to clean
everything up (sorry about that!). I wanted to get to the solution...!

> OK. Looking closer it looks like you build a custom catalog (catalog) 
> that is different from docbook.cat to reference the *.gml files.

Correct.

> What is your SGML_CATALOG_FILES? Do you use both docbook.cat and 
> catalog?

No, both are not needed. I point SGML_CATALOG_FILES at 'catalog' (my
modified version) and the jade dsssl 'catalog'.
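
The setup described here, SGML_CATALOG_FILES listing two catalogs
followed by the openjade command quoted earlier in the thread, could be
assembled roughly as below. This is a sketch under assumptions: the
function name openjade_invocation and every path passed to it are
placeholders, and it only builds the command and environment without
running anything.

```python
import os


def openjade_invocation(catalogs, stylesheet, declaration, document):
    """Build the openjade command line and an environment for running it.

    catalogs is a list of catalog file paths. They are joined with the
    platform's path-list separator (":" on Unix) into SGML_CATALOG_FILES.
    """
    env = dict(os.environ)                      # copy, do not mutate os.environ
    env["SGML_CATALOG_FILES"] = os.pathsep.join(catalogs)
    cmd = ["openjade", "-t", "xml", "-d", stylesheet, declaration, document]
    return cmd, env
```

The returned pair can be handed to subprocess.run(cmd, env=env,
check=True) once the catalog paths point at real files.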

If you wish to see how this is set up for the LDP (the document
publishing script), grab the following:

http://tldp.org/authors/tools/ldp_mk

With this script I can publish all (LDP) supported DTDs (XML or SGML).
Email w/any questions.

regards-



-- 
Greg Ferguson    * SGI principal engr / LDP contributor
SGI Tech Pubs    * http://techpubs.sgi.com/ | gferg(at)sgi.com
Linux Doc Project* http://tldp.org/         | gferg(at)metalab.unc.edu

______________________
http://lists.tldp.org/




  ©The Linux Documentation Project, 2014. Listserver maintained by dr Serge Victor on ibiblio.org servers.