Converting e-Books to Open Formats
Pyrite Publisher is designed mainly to go from normal HTML or text files to the Palm platform, not the other way around. The procedure discussed above does not scale to scenarios such as converting a large number of Palm e-books to customized HTML, with hyperlinks and metadata included. In such cases, the best solution might be a Perl script combining Perl's standard XML or HTML modules with the P5-Palm bundle; all of these are available from the Comprehensive Perl Archive Network (see the on-line Resources). The P5-Palm set of modules includes classes for reading, processing and writing the .pdb and .prc database files used by PalmOS devices.
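The article points to Perl and P5-Palm for this job. Purely as an illustration of the first step any such script has to perform, here is a minimal Python sketch that reads the fixed 78-byte header of a .pdb file and the record offsets that follow it. The function names are hypothetical, the latin-1 decoding of the database name is an assumption, and the sketch does not decompress or convert the e-book text itself.

```python
import struct

# Fixed-size PDB header: big-endian, 78 bytes in total.
# name(32) attributes(2) version(2) created(4) modified(4) backed_up(4)
# modification_number(4) app_info_offset(4) sort_info_offset(4)
# type(4) creator(4) unique_id_seed(4) next_record_list(4) num_records(2)
PDB_HEADER = struct.Struct(">32sHHIIIIII4s4sIIH")
RECORD_ENTRY_SIZE = 8  # 4-byte offset, 1-byte attributes, 3-byte unique ID


def read_pdb_header(data):
    """Return the main header fields of a Palm database held in `data` (bytes)."""
    if len(data) < PDB_HEADER.size:
        raise ValueError("too short to be a Palm database")
    (name, attributes, version, created, modified, backed_up,
     mod_number, app_info, sort_info, db_type, creator,
     uid_seed, next_list, num_records) = PDB_HEADER.unpack_from(data)
    return {
        # The name is NUL-padded; latin-1 is an assumption about its encoding.
        "name": name.split(b"\0", 1)[0].decode("latin-1"),
        "type": db_type.decode("ascii", "replace"),
        "creator": creator.decode("ascii", "replace"),
        "num_records": num_records,
    }


def read_record_offsets(data, num_records):
    """Return the file offset of each record, read from the record list."""
    offsets = []
    for index in range(num_records):
        position = PDB_HEADER.size + index * RECORD_ENTRY_SIZE
        (offset,) = struct.unpack_from(">I", data, position)
        offsets.append(offset)
    return offsets
```

Called on the bytes of a real file, for example `read_pdb_header(open(path, "rb").read())`, this is enough to sort a directory of Palm databases by type and creator before deciding how to convert each one.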
RocketBook e-books have several interesting characteristics, including support for compressed HTML files and indexes containing a summary of paragraph formatting and the position of the anchor names. These and many more details on .rb file internals are explained in the RB format page listed in the on-line Resources.
Rbmake
Rocket Ebook and Mobipocket files can be disassembled with a set of command-line tools called Rbmake. Its home page offers source code, binary packages, a mailing list and contact information to report bugs. To use rbmake, you need libxml2, version 2.3.1 or higher; the pcre (Perl-Compatible Regular Expressions) library; and zlib, to handle compression. To compile from source, at least on Fedora Core 2, you also need to install the pcre-devel package separately.
A nice feature of Rbmake is that its source code is structured in a modular manner. An entire library of object-oriented C routines can be compiled independently of the rest of the package and linked into any other program dealing with .rb files. In this way, should you want to write your own super-customized Rocket Ebook converter or simply index all of your e-books into a database, you would need to use only the piece that actually knows how to read and write the .rb format, the RbFile class. This chunk of code opens the file, returns a list of the sections composing the book and uncompresses on the fly only the ones actually required by the main program. Should you need them, the library also includes functions to match and replace parts of the content through Perl-compatible regular expressions.
The Rbmake tools should compile quickly and without problems on any modern GNU/Linux distribution. Exhaustive HTML documentation also is included in the source tarball. The binary that generates HTML files is called rbburst. It extracts all the components (text, images and an info file) present in the original .rb container. Figure 2 shows, in two separate Mozilla windows, the cover page and the table of contents of the files generated by rbburst when run on The Invisible Man by H. G. Wells.
Microsoft's Reader files, recognizable by the .lit extension, have many of the characteristics of traditional books, including pagination, highlighting and notes. They also support keyword searching and hyperlinks, but they are locked in to one reader platform.
The tool for converting these files is called, simply, Convert Lit. Running the program with the -help option lists, according to UNIX tradition, all the available command-line options. This program has three modes of operation: explosion, downconversion and inscribing. Explosion is the one needed to convert an existing .lit file to an OEBPS-compliant package. OEBPS (Open eBook Publication Structure) is covered later in the article.
Figure 3 shows a version of Shakespeare's A Midsummer Night's Dream obtained by using explosion from the Convert Lit program. Downconversion is the opposite process; it generates a .lit file for use by a Microsoft Reader-compliant device. Inscribing is a downconversion that also attaches a user-defined label to the .lit file. The exact syntax is explained on the program's home page (see Resources).
We already mentioned that Convert Lit creates an OEBPS package made of different files. Here is the complete list for the example above: Contents.htm, copyright.html, ~cov0024.htm, cover.jpg, MidSummerNightDream.opf, MobMids.html, PCcover.jpg, PCthumb.jpg, stylesheet.css and thumb.jpg. HTML, CSS and JPG files were to be expected, but what is the .opf file? It is an XML container that describes the structure of the book and carries part of the original book's metadata. The extension OPF stands for open electronic book package format. The OPF file contains references to the other pieces of the e-book, as well as descriptions of their attributes. To have a clearer idea of its role, a short excerpt of MidSummerNightDream.opf is shown in Listing 2.
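Independently of the excerpt in Listing 2, the role of the OPF file can be sketched in Python with the standard xml.etree.ElementTree module. The sample document below is a hypothetical, hand-written package in the general OEBPS 1.x style, not the actual content of MidSummerNightDream.opf, and the function names are illustrative. The sketch collects the Dublin Core metadata, the manifest of files and the reading order given by the spine.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample package, written by hand for this illustration.
SAMPLE_OPF = """<?xml version="1.0"?>
<package unique-identifier="bookid">
  <metadata>
    <dc-metadata xmlns:dc="http://purl.org/dc/elements/1.0/">
      <dc:Title>A Midsummer Night's Dream</dc:Title>
      <dc:Creator>William Shakespeare</dc:Creator>
    </dc-metadata>
  </metadata>
  <manifest>
    <item id="contents" href="Contents.htm" media-type="text/x-oeb1-document"/>
    <item id="css" href="stylesheet.css" media-type="text/x-oeb1-css"/>
    <item id="cover" href="cover.jpg" media-type="image/jpeg"/>
  </manifest>
  <spine>
    <itemref idref="contents"/>
  </spine>
</package>
"""


def local_name(tag):
    """Strip any '{namespace}' prefix and lowercase the element name."""
    return tag.rsplit("}", 1)[-1].lower()


def summarize_opf(opf_text):
    """Return (metadata, manifest, reading_order) for an OPF document."""
    root = ET.fromstring(opf_text)
    metadata = {}   # e.g. {"title": [...], "creator": [...]}
    manifest = {}   # item id -> href
    spine = []      # item ids in reading order
    for elem in root.iter():
        name = local_name(elem.tag)
        # Dublin Core elements live in a purl.org/dc namespace.
        if elem.tag.startswith("{") and "purl.org/dc" in elem.tag and elem.text:
            metadata.setdefault(name, []).append(elem.text.strip())
        elif name == "item":
            manifest[elem.get("id")] = elem.get("href")
        elif name == "itemref":
            spine.append(elem.get("idref"))
    # Resolve spine ids to file names, skipping ids missing from the manifest.
    reading_order = [manifest[item_id] for item_id in spine if item_id in manifest]
    return metadata, manifest, reading_order
```

Matching on local element names rather than on one fixed namespace is a deliberate simplification, since packages produced by different tools may declare Dublin Core slightly differently.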
Articles about Digital Rights and more at http://stop.zona-m.net
CV, talks and bio at http://mfioretti.com