discuss: Thread: Linux Dictionary Help Required


Subject: Linux Dictionary Help Required
From: "Binh Nguyen" ####@####.####
Date: 3 Jul 2003 11:45:20 -0000
Message-Id: <Law15-DAV58IwGAGoML0000d8df@hotmail.com>

Hello out there,

I'm currently getting the Linux Dictionary edited for preliminary
publication. Does anyone know of a way to convert the thing to XML/SGML
without going insane =) (please remember that it's about 1300 pages)?
It's currently organised thus:

Meaning (Tab) Really long line which is the definition
New-line
Meaning (Tab) Really long line which is the definition
New-line
Meaning (Tab) Really long line which is the definition
etc....


Thanks,
Binh.
Subject: Re: Linux Dictionary Help Required
From: chris albert ####@####.####
Date: 3 Jul 2003 12:17:33 -0000
Message-Id: <3F041EA1.9090702@mcgill.ca>

Binh

>Hello out there,
>
>I'm currently getting the Linux Dictionary edited for preliminary
>publication. Does anyone know of a way to convert the thing to XML/SGML
>without going insane =) (please remember that it's about 1300 pages)?
>It's currently organised thus:
>
>Meaning (Tab) Really long line which is the definition
>New-line
>Meaning (Tab) Really long line which is the definition
>New-line
>Meaning (Tab) Really long line which is the definition
>etc....
>
>  
>
This should be easy. I suggest a little Perl script that reads the file
from standard input, splits each line on the tab, sets two variables,
$meaning and $def, and then inserts those variables into the appropriate
XML text, say using 'glossary' tags, printing to a new file. Then you can
just add the superstructure to that file to get the final doc, or keep it
as an external file you can include in your doc.

This is also possible with a shell script using sed/awk, or with macros
in an editor like Vim, but I think Perl is easiest.
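
For illustration only, here is a minimal sketch of the split-on-tab
approach, written in Python rather than Perl. The function names
(line_to_glossentry, convert) are hypothetical, and the DocBook
glossentry markup is an assumption about which 'glossary' tags are
wanted. It splits on the first tab only and escapes &, < and > so that
terms such as <p> do not break the markup.

```python
from xml.sax.saxutils import escape


def line_to_glossentry(line):
    """Turn one 'term<TAB>definition' line into a glossentry string.

    Returns None for blank separator lines.
    """
    line = line.rstrip("\r\n")
    if not line.strip():
        return None
    # Split on the first tab only, so any tab inside the definition survives.
    term, sep, definition = line.partition("\t")
    if not sep:
        raise ValueError("no tab found in line: %r" % line)
    return (
        "<glossentry><glossterm>%s</glossterm>"
        "<glossdef><para>%s</para></glossdef></glossentry>"
        % (escape(term.strip()), escape(definition.strip()))
    )


def convert(lines):
    """Yield one glossentry string per non-blank input line."""
    for line in lines:
        entry = line_to_glossentry(line)
        if entry is not None:
            yield entry


# Small in-memory demonstration; the sample lines are made up.
sample = [
    "One\tThe first integer after zero.\n",
    "\n",
    "<p>\tHTML Paragraph markup.\n",
]
for entry in convert(sample):
    print(entry)
```

Any iterable of lines works, so passing sys.stdin to convert() would
process a file piped in on standard input. The surrounding glossary
superstructure still has to be added by hand or by a wrapper.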

Chris

Subject: Re: Linux Dictionary Help Required
From: Charles Curley ####@####.####
Date: 3 Jul 2003 12:23:13 -0000
Message-Id: <20030703122316.GB4413@charlescurley.com>

On Thu, Jul 03, 2003 at 09:43:24PM +1000, Binh Nguyen wrote:
> Hello out there,
> 
> I'm currently getting the Linux Dictionary edited for preliminary
> publication. Does anyone know of a way to convert the thing to XML/SGML
> without going insane =) (please remember that it's about 1300 pages)?
> It's currently organised thus:
> 
> Meaning (Tab) Really long line which is the definition
> New-line
> Meaning (Tab) Really long line which is the definition
> New-line
> Meaning (Tab) Really long line which is the definition
> etc....

I would do it with a few one-off macros in Emacs.

I assume you want an alphabetic sort. You may have terms that need
special sort keys which are not the term defined. In that case, I'd
write a macro to take the term and duplicate it, like so:

key TAB term TAB definition NEW LINE

Then you can edit the sort keys as needed, e.g.:

1 TAB One TAB The first integer after zero.
; TAB Semicolon TAB Character used in many languages to indicate the end of a statement.
P TAB <p> TAB HTML Paragraph markup.

By putting the keys at the beginning of the line, you can sort on the
sort keys easily enough.
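
As a rough sketch of the same idea outside Emacs, the Python below
duplicates each term as its default sort key, lets individual keys be
overridden, and sorts on the key. The helper names (add_sort_keys,
with_key, sort_records) are hypothetical, and the case-insensitive
ordering is an assumption, not something specified here.

```python
def add_sort_keys(lines):
    """Parse 'term<TAB>definition' lines into (key, term, definition) tuples.

    The key starts out as a copy of the term; blank lines are skipped.
    """
    records = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue
        term, _, definition = line.partition("\t")
        records.append((term, term, definition))
    return records


def with_key(records, term, new_key):
    """Return a copy of records with the sort key for one term replaced."""
    return [(new_key if t == term else k, t, d) for k, t, d in records]


def sort_records(records):
    """Sort on the key, ignoring case; sorted() is stable, so ties keep input order."""
    return sorted(records, key=lambda rec: rec[0].casefold())


# Made-up sample data mirroring the three entries in the example above.
sample = [
    "<p>\tHTML Paragraph markup.\n",
    "Semicolon\tCharacter used in many languages to indicate the end of a statement.\n",
    "One\tThe first integer after zero.\n",
]
records = add_sort_keys(sample)
records = with_key(records, "<p>", "P")
records = with_key(records, "Semicolon", ";")
records = with_key(records, "One", "1")
for key, term, definition in sort_records(records):
    print("\t".join((key, term, definition)))
```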

Next, an Emacs macro to encode the dictionary. I'd make the sort key
an SGML comment so that you can continue to sort on the keys after
you've encoded it. Something like:

<!-- 1       --><glossentry><glossterm>One</glossterm><glossdef><para>The first integer after zero.</para></glossdef></glossentry>

You should make all the comments the same length to make the sorting
easy.
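
A hedged sketch of that encoding step in Python, for anyone not using
Emacs: it pads every key to one fixed width inside the comment, so a
plain line sort of the encoded file still orders entries by key. The
name encode_entry and the width of 8 are assumptions chosen for the
example. Two details the sketch guards against: a comment may not
contain "--", and terms such as <p> must be escaped before they go
inside glossterm.

```python
from xml.sax.saxutils import escape

KEY_WIDTH = 8  # assumed width; it must be at least as long as the longest key


def encode_entry(key, term, definition, width=KEY_WIDTH):
    """Return one glossentry line prefixed by a fixed-width sort-key comment."""
    if "--" in key:
        raise ValueError("'--' is not allowed inside a comment: %r" % key)
    if len(key) > width:
        raise ValueError("key %r is longer than %d characters" % (key, width))
    # ljust pads with spaces so every comment has the same length.
    comment = "<!-- %s -->" % key.ljust(width)
    return (
        "%s<glossentry><glossterm>%s</glossterm>"
        "<glossdef><para>%s</para></glossdef></glossentry>"
        % (comment, escape(term), escape(definition))
    )


# Made-up sample records in (key, term, definition) form.
records = [
    ("P", "<p>", "HTML Paragraph markup."),
    ("1", "One", "The first integer after zero."),
    (";", "Semicolon", "Character used in many languages to indicate the end of a statement."),
]
encoded = [encode_entry(*rec) for rec in records]
# Every line starts with a comment of the same length, so sorting whole lines sorts by key.
for line in sorted(encoded):
    print(line)
```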

-- 

Charles Curley                  /"\    ASCII Ribbon Campaign
Looking for fine software       \ /    Respect for open standards
and/or writing?                  X     No HTML/RTF in email
http://www.charlescurley.com    / \    No M$ Word docs in email

Key fingerprint = CE5C 6645 A45A 64E4 94C0  809C FFF6 4C48 4ECD DFDB
