Converting e-Books to Open Formats
Pyrite Publisher is designed mainly to go from normal HTML or text files to the Palm platform, not the other way around. The procedure discussed above is not really scalable to scenarios such as converting a great quantity of Palm e-books to customized HTML, with hyperlinks and metadata included. In such cases, the best solution might be a Perl script combining the standard XML or HTML modules for this language with the P5-Palm bundle; these are available from the Comprehensive Perl Archive Network (see the on-line Resources). The P5-Palm set of modules includes classes for reading, processing and writing the .pdb and .prc database files used by PalmOS devices.
RocketBook e-books have several interesting characteristics, including support for compressed HTML files and indexes containing a summary of paragraph formatting and the position of the anchor names. These and many more details on .rb file internals are explained in the RB format page listed in the on-line Resources. Rbmake Rocket Ebook and Mobipocket files can be disassembled with a set of command-line tools called Rbmake. Its home page offers source code, binary packages, a mailing list and contact information to report bugs. To use rbmake, you need libxml2, version 2.3.1 or higher; the pcre (Perl-Compatible Regular Expressions) library; and zlib, to handle compression. To compile from source—at least on Fedora Core 2—it also is necessary to install separately the pcre-devel package.
A nice feature of Rbmake is the source code is structured in a modular manner. An entire library of object-oriented C routines can be compiled and linked independently from the rest of the package from any other program dealing with .rb files. In this way, should you want to write your own super-customized Rocket Ebook converter or simply index all of your e-books into a database, you would need to use only the piece that actually knows how to read and write the .rb format, the RbFile class. This chunk of code opens the file, returns a list of the sections composing the book and uncompresses on the fly only the ones actually required by the main program. Should you need them, the library also includes functions to match and replace parts of the content through Perl-compatible regular expressions.
The Rbmake tools should compile quickly and without problems on any modern GNU/Linux distribution. Exhaustive HTML documentation also is included in the source tarball. The binary file able to generate HTML files is called rbburst. It extracts all the components—text, images and an info file—present in the original .rb container. Figure 2 shows, in two separate Mozilla Windows, the cover page and the table of contents of the file generated by rbburst when run on The Invisible Man by H. G. Wells.
Microsoft's Reader files, recognizable by the .lit extension, have many of the characteristics of traditional books, including pagination, highlighting and notes. They also support keyword searching and hyperlinks, but they are locked in to one reader platform.
The tool for converting these files is called, simply, Convert Lit. Running the program with the -help option lists, according to UNIX tradition, all the available command-line options. This program has three modes of operation: explosion, downconversion and inscribing. Explosion is the one needed to convert an existing .lit file to an OEBPS-compliant package. OEBPS (Open eBook Publication Structure) is covered later in the article.
Figure 3 shows a version of Shakespeare's A Midsummer's Night Dream obtained by using explosion from the Convert Lit program. Downconversion is the opposite process; it generates a .lit file for use by a Microsoft Reader-compliant device. Inscribing is when the downconversion attaches a user-defined label to the .lit file. The exact syntax is explained on the program's home page (see Resources).
We already mentioned that Convert Lit creates an OEBPS package made of different files. Here is the complete list for the example above: Contents.htm, copyright.html, ~cov0024.htm, cover.jpg, MidSummerNightDream.opf, MobMids.html, PCcover.jpg, PCthumb.jpg, stylesheet.css and thumb.jpg. HTML, CSS and JPG files were to be expected, but what is the .opf file? It is an XML container describing the structure and several portions of the original book's metadata. The extension OPF stands for open electronic book package format. The OPF file contains references to the other pieces of the e-book, as well as descriptions of their attributes. To have a clearer idea of its role, a short excerpt of MidSummerNightDream.opf is shown in Listing 2.
Articles about Digital Rights and more at http://stop.zona-m.net CV, talks and bio at http://mfioretti.com
|Happy Birthday Linux||Aug 25, 2016|
|ContainerCon Vendors Offer Flexible Solutions for Managing All Your New Micro-VMs||Aug 24, 2016|
|Updates from LinuxCon and ContainerCon, Toronto, August 2016||Aug 23, 2016|
|NVMe over Fabrics Support Coming to the Linux 4.8 Kernel||Aug 22, 2016|
|What I Wish I’d Known When I Was an Embedded Linux Newbie||Aug 18, 2016|
|Pandas||Aug 17, 2016|
- Download "Linux Management with Red Hat Satellite: Measuring Business Impact and ROI"
- Happy Birthday Linux
- The Great Software Schism
- New Version of GParted
- Updates from LinuxCon and ContainerCon, Toronto, August 2016
- ContainerCon Vendors Offer Flexible Solutions for Managing All Your New Micro-VMs
- What I Wish I’d Known When I Was an Embedded Linux Newbie
- A New Project for Linux at 25
- Tor 0.2.8.6 Is Released
- Blender for Visual Effects
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide