discuss: ePUB format(s); description of options and hurdles


Previous by date: 18 Mar 2016 23:32:32 +0000 Re: ePUB format(s); description of options and hurdles, Leo Noordergraaf
Next by date: 18 Mar 2016 23:32:32 +0000 output tree wrangling (website content overhaul); archive proposal, Martin A. Brown
Previous in thread: 18 Mar 2016 23:32:32 +0000 Re: ePUB format(s); description of options and hurdles, Leo Noordergraaf
Next in thread: 18 Mar 2016 23:32:32 +0000 Re: ePUB format(s); description of options and hurdles, jdd

Subject: Re: ePUB format(s); description of options and hurdles
From: "Martin A. Brown" ####@####.####
Date: 18 Mar 2016 23:32:32 +0000
Message-Id: <alpine.LSU.2.11.1603181615520.12423@znpeba.jbaqresebt.arg>

Hello Leo (et alia),

>I came to like epub a lot as it allows me to carry my library in an 
>e-reader. So I do hope that tldp will support epub.

I am in agreement with you here.  It would be good to be able to 
support an epub format (any epub format), especially since we have 
effectively dropped the (ahead-of-its-time) PluckerDB format.

>As far as generators go, it is unfortunate that not all accepted 
>source formats are easily converted. I understand your discomfort 
>regarding only partial support for epub.
>
>1) do not support epub,
>
>2) create a generator suite that can handle all accepted source 
>   formats,
>
>3) use multiple generators, one for each source format and perhaps 
>   some are not available yet.

Yes, that's our set of options.

>Going for 1) is a pity in my opinion.

I agree completely.  It would be better to offer partial support for 
one of the EPUB standards (1.0.1, 2.0.1 or 3.0.1) than to avoid it 
entirely.

>TLDP isn't really in the business of creating epub converters, 
>let's drop 2). 

Maybe.  I'm still thinking about that.  Option 2 is my most desired 
outcome, but it represents work that goes beyond the scope of TLDP 
(and possibly beyond my capabilities).  Still, a tool that could 
process arbitrary HTML (or XHTML) and turn it into an epub would 
also be generally useful.

>That leaves the third option. I suppose that at the moment there 
>are more pressing things to do than worry about a single output 
>format. 

But, better to have agreement on a plan, even if the plan has not 
yet been set in motion.  So, thank you for replying!

I think the least-effort path to support EPUB from our source 
collection would involve partial support for EPUB 2.0.1, but I 
thought I'd wait before engaging in any effort, to see what other 
TLDP members thought and whether others knew of tools or efforts 
that are invisible to me.

I have not done any work on additional output format support in the 
last week or so, as I have moved on to the question of the output 
tree and overall tldp.org website organization.

>So my suggestion is to support epub as an output format for those 
>source formats where it is easily supported and strive to include 
>all source formats eventually or otherwise drop epub support 
>completely.

I hear one vote and recommendation for supporting EPUB 
opportunistically, wherever the input format allows.  In our case, 
that would probably mean EPUB outputs could be generated from 
Asciidoc and any of the DocBook XML formats, but not from any of the 
SGML-based formats (DocBook 3.x, DocBook 4.x and Linuxdoc).
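
To make the opportunistic rule concrete, here is a rough Python sketch.  
The format labels and the function name are invented for illustration, 
and the command lines are my assumption of the simplest xmlto and a2x 
invocations; nothing here is existing TLDP tooling.

```python
from typing import List, Optional

# Hypothetical labels for source formats; only the XML-based and
# Asciidoc sources have an assumed EPUB tool in this sketch.
EPUB_TOOLS = {
    "docbook4xml": ["xmlto", "epub"],
    "docbook5xml": ["xmlto", "epub"],
    "asciidoc": ["a2x", "-f", "epub"],
}


def epub_command(source_format: str, path: str) -> Optional[List[str]]:
    """Return an argv list that should build an EPUB, or None.

    None means "skip EPUB for this document", which is what happens
    for the SGML-based sources (DocBook SGML, Linuxdoc).
    """
    tool = EPUB_TOOLS.get(source_format)
    if tool is None:
        return None
    return tool + [path]
```

The function only builds the command; running it (for example with 
subprocess.run) is left to the caller, so a missing tool does not 
break the other output formats.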

Thank you for your thoughts, Leo,

-Martin

>> I have examined two ePUB specifications, both EPUB 2.0.1 (epub2) [0] 
>> and EPUB 3.0.1 (epub3) [1].  I have not studied EPUB 1.0.1 [2].
>> 
>> Here's what I have learned.
>> 
>> The standard for epub3 is newer and includes features that LDP is 
>> unlikely to use.  These include media overlay (which defines a 
>> format for synchronizing text and audio [3]) and content obfuscation 
>> (in lieu of full DRM).
>> 
>> While the docbook-xsl-stylesheets project (Bob Stayton) provides 
>> support and a handy README for generating epub3 content, there does 
>> not appear to be an (upstream, distribution-supplied) tool that can 
>> generate epub3.
>> 
>> Available tools:
>> 
>>   * xmlto generates epub1; only reads XML docs; for us, that would 
>>     mean support only for DocBook XML 4.x and DocBook XML 5.0
>> 
>>   * a2x generates epub1; internally, a2x converts asciidoc to 
>>     DocBook 4.5 XML before producing the epub
>> 
>>   * docbook-xsl-stylesheets can generate XHTML suitable for epub3; 
>>     user still needs to package up the .epub file; would 
>>     mean support only for DocBook XML and asciidoc files
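
The "package up the .epub file" step is small.  An .epub is a zip 
archive whose first entry must be a file named mimetype, stored 
uncompressed.  A minimal sketch, assuming the stylesheets have already 
written an unpacked tree (mimetype, META-INF/, OEBPS/) into src_dir; 
the function name and directory layout are illustrative assumptions:

```python
import os
import zipfile


def package_epub(src_dir, out_file):
    """Zip an unpacked EPUB tree into out_file (a path or file object)."""
    with zipfile.ZipFile(out_file, "w", zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry goes first and must not be compressed.
        zf.write(os.path.join(src_dir, "mimetype"), "mimetype",
                 compress_type=zipfile.ZIP_STORED)
        for root, dirs, files in os.walk(src_dir):
            dirs.sort()  # deterministic traversal order
            for name in sorted(files):
                full = os.path.join(root, name)
                # Archive names always use forward slashes.
                arcname = os.path.relpath(full, src_dir).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already written above
                zf.write(full, arcname)
```

This does not validate the result; running a checker over the output 
would still be a good idea.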
>> 
>> In addition to the question of epub3 vs. epub2, there's the problem 
>> of the HTML outputs from the SGML documents.  These are not XHTML 
>> and would need to be converted to XHTML before being included in any 
>> epub document.
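
As a rough idea of what the HTML-to-XHTML step involves, here is a 
sketch using only the Python standard library.  It self-closes void 
elements, quotes attributes, escapes text and closes any tags left 
open.  It is an assumption-laden illustration, not a real converter: 
it drops comments and the doctype, adds no XHTML namespace, and does 
not repair things like duplicate attributes or implied paragraph ends.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Elements that never have a closing tag in HTML.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "param", "source", "track", "wbr"}


class XhtmlWriter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.stack = []  # currently open (non-void) elements

    def handle_starttag(self, tag, attrs):
        parts = [tag]
        for name, value in attrs:
            # Bare attributes (e.g. "selected") get their own name as value.
            parts.append("%s=%s" % (name, quoteattr(name if value is None else value)))
        if tag in VOID:
            self.out.append("<%s/>" % " ".join(parts))
        else:
            self.out.append("<%s>" % " ".join(parts))
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray close tag; ignore it
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append("</%s>" % open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))


def html_to_xhtml(html):
    writer = XhtmlWriter()
    writer.feed(html)
    writer.close()
    while writer.stack:  # close anything the source left open
        writer.out.append("</%s>" % writer.stack.pop())
    return "".join(writer.out)
```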
>> 
>> My summary of the situation is roughly like this:
>> 
>>   * We could, probably, fairly easily support epub outputs for each
>>     of the DocBook XML and Asciidoc formats.  Fastest solution would 
>>     probably be using xmlto.  But, that's no solution for the 
>>     Linuxdoc and DocBookSGML sources.
>> 
>>   * Convert the HTML outputs (from SGML sources) to XHTML.  Then, we 
>>     are building our own epub generation tool.  If so (and if I were 
>>     undertaking this), there's a pretty good-looking library called 
>>     python-epub [4] which generates epub2.
>> 
>> I am definitely interested in this epub nonsense, but it seems 
>> there's quite a bit of work to support epub for our entire 
>> collection.  Partial support of our source set (XML sources) would 
>> not be as tricky (but somehow that bothers me a bit). I'd be 
>> interested in any thoughts people have about which of the many paths 
>> we might take from here.
>> 
>> -Martin
>> 
>>  [0] http://idpf.org/epub/201
>>  [1] http://idpf.org/epub/301
>>  [2] http://www.digitalpreservation.gov/formats/fdd/fdd000054.shtml
>> 
>>  [3] Which reminds me of the old synchronized 78 rpm records that 
>>      had stories like Bozo the Clown Under the Sea.
>>      https://www.youtube.com/watch?v=lgJmBrW4D80
>> 
>>  [4] https://bitbucket.org/exirel/epub
>>      http://epub.exirel.me/  # -- in French
>> 

-- 
Martin A. Brown
http://linux-ip.net/



  ©The Linux Documentation Project, 2014. Listserver maintained by dr Serge Victor on ibiblio.org servers.