docbook: MS Word to XML


Previous by date: 15 Nov 2003 09:11:55 -0000 Re: MS Word to XML, Bob Stayton
Next by date: 15 Nov 2003 09:11:55 -0000 Re: MS Word to XML, Tabatha Marshall
Previous in thread: 15 Nov 2003 09:11:55 -0000 Re: MS Word to XML, Bob Stayton
Next in thread: 15 Nov 2003 09:11:55 -0000 Re: MS Word to XML, Tabatha Marshall

Subject: Re: MS Word to XML
From: Tabatha Marshall ####@####.####
Date: 15 Nov 2003 09:11:55 -0000
Message-Id: <1068887483.21346.122.camel@mysticchild>

Thanks for the info, Bob!

I tried a few tools, and was actually waiting for the Upcast license in
my email when I got this message from you.  

I loaded an MS Word doc into Upcast and managed to get the settings
correct for conversion.  I learned that you have to convert with the
Upcast DTD first; the resulting XML file is then run through the
DocBook DTD, which, as I understand it, is how you are supposed to get
DocBook output.
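
For anyone who wants to script that second stage, here is a minimal
sketch, not UpCast's own procedure.  It assumes the second stage is an
XSL stylesheet (as Bob describes in the quoted message below), that
xsltproc and xmllint from libxml2/libxslt are installed, and that the
DocBook output carries a DOCTYPE declaration.  The file names
(guide-upcast.xml, upcast2docbook.xsl, guide.xml) are hypothetical
placeholders.

```python
import shutil
import subprocess


def build_docbook_commands(upcast_xml, stylesheet, docbook_xml):
    """Build the commands for stage two: UpCast XML -> DocBook XML, then validate."""
    # xsltproc applies the stylesheet to the input and writes the result to docbook_xml.
    transform = ["xsltproc", "--output", docbook_xml, stylesheet, upcast_xml]
    # --valid checks the file against the DTD named in its DOCTYPE (assumed present);
    # --noout suppresses the echo of the parsed document.
    validate = ["xmllint", "--noout", "--valid", docbook_xml]
    return [transform, validate]


def run_commands(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} is not installed")
        subprocess.run(cmd, check=True)


# Hypothetical file names, for illustration only.
commands = build_docbook_commands(
    "guide-upcast.xml", "upcast2docbook.xsl", "guide.xml"
)
for cmd in commands:
    print(" ".join(cmd))
# Call run_commands(commands) once the files actually exist.
```

Keeping the command construction separate from execution makes it easy
to inspect what would be run before touching any files.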

But something happened that I didn't expect.  Unfortunately, the MS Word
document properties were interpreted in a way that ruined the metadata
of the resulting XML file.  It attempted to use unusual combinations of
nested tags to do things that would take me only one or two tags to do
in good old XEmacs.

Since we have reviewers who don't have Linux but run Windows, I
followed the suggested link to www.morphon.com.

I am VERY VERY pleased with this program!

It's being offered under a free license.  It is able to parse and
validate DocBook just fine.  It also offers alternative views other than
source with markup tags, for those Windows users who aren't comfortable
working that way.  I found, though, that the other views seemed to hide
URL links provided in the document.

This seems to be a good solution for reviewers who are still only
comfortable using Windows applications, and will allow them to make
their revisions without adding any proprietary data to the source file. 
And since the reviewers are working from a copy of the original, we can
still easily provide diff files to the authors to compare against the
original, as we always send these with the revisions (at least that's
always been my practice).
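
For reviewers without a diff tool on Windows, the comparison between
the original and the revised copy can be sketched with Python's
standard difflib module.  This is only an illustration, not the
procedure we actually use; the function name, file names, and sample
text are made up for the example.

```python
import difflib


def make_review_diff(original_text, revised_text,
                     original_name="guide.xml",
                     revised_name="guide-reviewed.xml"):
    """Return a unified diff (as one string) between two versions of a document."""
    diff_lines = difflib.unified_diff(
        original_text.splitlines(keepends=True),
        revised_text.splitlines(keepends=True),
        fromfile=original_name,
        tofile=revised_name,
    )
    return "".join(diff_lines)


# Made-up sample content: the reviewer fixes one typo in the first paragraph.
original = "<para>Install the pakage.</para>\n<para>Reboot.</para>\n"
revised = "<para>Install the package.</para>\n<para>Reboot.</para>\n"
print(make_review_diff(original, revised), end="")
```

The output is in unified diff format, so an author on Linux can read it
or apply it with patch in the usual way.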

I just wanted to make sure that both lists found out about this tool.  I
thought it might make the newer reviewers feel better about not having
Linux tools.

Thanks for all the help and references!
Tab


On Fri, 2003-11-14 at 10:09, Bob Stayton wrote:
> On Thu, Nov 13, 2003 at 04:54:55PM -0800, Tabatha Marshall wrote:
> > Hi all,
> > 
> > I've been exploring Windows (and Linux) solutions for transforming MS
> > Word documents into XML, preferably DocBook XML 4.2.
> > 
> > I tried XMLSPY, which can only be evaluated for 30 days, and I've tried
> > Morphon, which is nice for working in XML, but I couldn't figure out
> > what to do with the MS Word doc.
> > 
> > When I've used my Linux tools to convert, I've ended up with an XML
> > file, but it's so awful thanks to all the junk in MS Word, it makes me
> > want to scrap it and just cut/paste everything in, writing the tags in
> > myself.
> > 
> > Anybody have better luck finding an easy way to convert?  Your
> > suggestions are most welcome, and the sooner the better.  I have a guide
> > that needs conversion to XML before month-end.
> > 
> > For the benefit of our reviewers, many of whom use Windows, please use
> > "Reply All" if you have ideas to share on this subject!
> 
> You could check the DocBookWiki tools page, which includes
> several "up" conversion tools:
> 
> http://docbook.org/wiki/moin.cgi/DocBookTools
> 
> I've used UpCast with some success.  It converts a Word
> file to an XML file in its own generalized UpCast DTD,
> and then you can get an XSL stylesheet from them that
> converts the UpCast document to a DocBook document.
> 
> Bob Stayton                                 400 Encinal Street
> Publications Architect                      Santa Cruz, CA  95060
> Technical Publications                      voice: (831) 427-7796
> The SCO Group                               fax:   (831) 429-1887
>                                             email: ####@####.####
-- 
Tabatha Marshall
Web: www.merlinmonroe.com
Linux Documentation Project Review Coordinator (http://www.tldp.org)
Linux Counter Area Manager US:wa (http://counter.li.org)




  ©The Linux Documentation Project, 2014. Listserver maintained by dr Serge Victor on ibiblio.org servers. See current spam statz.