Building Impress and PowerPoint Slides with LaTeX and Perl

Forced to use proprietary file formats? Let open source ease the burden.

The title_slide subroutine returns raw XML, which is inserted into the document.
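As a rough illustration of that idea, here is a minimal sketch of a subroutine that builds slide XML as a string. It is not the article's title_slide: the name title_slide_sketch, the xml_escape helper and the exact elements and attributes are assumptions, loosely modelled on what an Impress content.xml contains. The real markup should be copied from blank.sxi.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first so later entities are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical stand-in for title_slide: returns raw XML for one slide.
# The element and attribute names are illustrative assumptions only.
sub title_slide_sketch {
    my ( $page_number, $title ) = @_;
    my $safe_title = xml_escape($title);
    return <<"END_XML";
<draw:page draw:name="page$page_number">
  <draw:text-box presentation:class="title">
    <text:p>$safe_title</text:p>
  </draw:text-box>
</draw:page>
END_XML
}

print title_slide_sketch( 1, 'Perl & LaTeX' );
```

Escaping the title before interpolating it matters here, because chapter titles taken from LaTeX source can easily contain an ampersand or angle bracket that would otherwise leave the document malformed.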

Given an input file in the textual format produced by getcontent, the produce_slides script clones the blank.sxi Impress file and populates any number of slides, programmatically producing a presentation. The script is similar in structure to getcontent; its only wart is the verbatim inclusion of the XML required for each of the three slide types contained in blank.sxi. To create a presentation, invoke produce_slides as follows:

perl produce_slides 3 chapter3.input

This results in a new Impress document called chapter3.sxi appearing on disk.
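To give a feel for the first half of that job, the sketch below groups tagged lines into a list of slides. The tag names (CHAPTERTITLE, BULLETTITLE, BULLETCONTENT, CHAPTERCONTENT, STARTCODE/STOPCODE, STARTMAXIM/STOPMAXIM, GRAPHICCAPTION, GRAPHICNAME) are the ones getcontent prints. Everything else is an assumption made for illustration: the parse_tagged_lines name, the hash layout, and the rule that each title, code block, maxim or graphic starts a new slide. The real produce_slides may divide the content differently.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Group getcontent-style tagged lines into a list of slide hashes.
# The grouping rules are an illustrative assumption, not the article's.
sub parse_tagged_lines {
    my @slides;
    my $block;    # slide currently collecting code or maxim lines, if any

    for my $line (@_) {
        chomp( my $text = $line );    # copy first so the caller's data is untouched

        if ($block) {
            if ( $text eq 'STOPCODE' || $text eq 'STOPMAXIM' ) {
                undef $block;
            }
            else {
                push @{ $block->{lines} }, $text;
            }
        }
        elsif ( $text =~ /^(CHAPTERTITLE|BULLETTITLE): (.*)$/ ) {
            push @slides, { tag => $1, title => $2, lines => [] };
        }
        elsif ( $text =~ /^(?:CHAPTERCONTENT|BULLETCONTENT): (.*)$/ ) {
            # Content with no preceding title is silently skipped.
            push @{ $slides[-1]{lines} }, $1 if @slides;
        }
        elsif ( $text eq 'STARTCODE' || $text eq 'STARTMAXIM' ) {
            $block = { tag => $text, title => '', lines => [] };
            push @slides, $block;
        }
        elsif ( $text =~ /^GRAPHICCAPTION: (.*)$/ ) {
            push @slides, { tag => 'GRAPHIC', title => $1, lines => [] };
        }
        elsif ( $text =~ /^GRAPHICNAME: (.*)$/ ) {
            $slides[-1]{image} = $1 if @slides;
        }
    }
    return @slides;
}

# Made-up sample input in the getcontent output format.
my @slides = parse_tagged_lines(
    "CHAPTERTITLE: A Sample Chapter\n",
    "CHAPTERCONTENT: An opening remark\n",
    "BULLETTITLE: First Section\n",
    "BULLETCONTENT: First subsection\n",
    "BULLETCONTENT: Second subsection\n",
    "STARTCODE\n",
    "print \"hello\\n\";\n",
    "STOPCODE\n",
);

printf "%-14s %s (%d lines)\n", $_->{tag}, $_->{title}, scalar @{ $_->{lines} }
    for @slides;
```

Keeping the parsing separate from the XML generation means the same slide list could be handed to whichever subroutine emits the XML for each slide type.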

With the Impress files created, I needed to replace my graphic image placeholders with the actual images. The getcontent script extracted only the image filename, however, not the image itself. Importing the images into Impress should have been straightforward, except that the originals I had were of pretty poor quality compared to those that made it into the book. The final images had been improved greatly during the publisher's final typesetting phase. And, of course, I didn't have the final image files.

Then I remembered that the publisher had sent final proof PDFs with all the high-quality graphic images in place. I used xpdf to view the proofs at 200% and then fired up The GIMP to screen-capture the xpdf display window. I then cut out the graphic image and saved it as a JPEG. It took a little while, but when finished I had a beautiful set of book-quality images to import into my Impress presentations. With this task complete, I exported the Impress document to PowerPoint format and the job was done. My initial estimate of 20 days of effort was reduced to about 20 hours of real work.

And now, of course, if I need to produce some slides quickly, I can create my textual content manually in vi, run it through the produce_slides script and I'm done.

Final Words

What started off as a seemingly impossible task—programmatically producing PowerPoint presentations—turned out to be quite possible, thanks to open source. All the tools I needed shipped out of the box with my stock Red Hat 9 distribution: vi, unzip, Perl, xmllint, xpdf, The GIMP and the OpenOffice.org suite.

Resources for this article: /article/8055.

Paul Barry (paul.barry@itcarlow.ie) lectures at the Institute of Technology, Carlow, in Ireland. Information on the courses he teaches, in addition to the books and articles he has written, can be found on his Web site, glasnost.itcarlow.ie/~barryp.

______________________

Comments

Great ideas, thanks!

Anonymous's picture

getcontent script

Norberto's picture

I am not myself a Perl programmer. Is there a way to obtain a working getcontent script?

Best

Missing getcontent script

barryp's picture

Sorry ... the script appears to be missing from the download. Here it is:

#! /usr/bin/perl -w

#
# The "getcontent" script: Given a LaTeX file on the command-line,
# extract its textual content.
#
# By Paul Barry, paul.barry@itcarlow.ie
#

use strict;

use constant TRUE => 1;
use constant FALSE => 0;

my $in_verbatim = FALSE;
my $in_maxim = FALSE;
my $graphic_name = '';

while ( <> )
{
    if ( $in_maxim )
    {
        if ( /\\end\{maxim\}/ )
        {
            print "STOPMAXIM\n";
            $in_maxim = FALSE;
        }
        else
        {
            print;
        }
        next;
    }

    if ( $in_verbatim )
    {
        if ( /\\end\{verbatim\}/ || /\\end\{alltt\}/ )
        {
            print "STOPCODE\n";
            $in_verbatim = FALSE;
        }
        else
        {
            print;
        }
        next;
    }

    if ( /\\chapter\{(.*)\}/ )
    {
        print "CHAPTERTITLE: $1\n"; next;
    }

    if ( /\\section\{(.*)\}/ )
    {
        print "BULLETTITLE: $1\n"; next;
    }

    if ( /\\subsection\{(.*)\}/ )
    {
        print "BULLETCONTENT: $1\n"; next;
    }

    if ( /\\begin\{verbatim\}/ || /\\begin\{alltt\}/ )
    {
        print "STARTCODE\n";
        $in_verbatim = TRUE; next;
    }

    if ( /\\begin\{maxim\}/ )
    {
        print "STARTMAXIM\n";
        $in_maxim = TRUE; next;
    }

    if ( /images\/(.*?)\}/ )
    {
        $graphic_name = $1; next;
    }

    if ( /\\caption\{\\label\{/ )
    {
        /label\{.*?\}(.*)\}\}/;
        print "GRAPHICCAPTION: $1\n";
        print "GRAPHICNAME: $graphic_name\n"; next;
    }

    if ( /^\\textit\{(.*)\}/ )
    {
        print "CHAPTERCONTENT: $1\n"; next;
    }
}

Paul Barry

Some important updates to the OpenOffice::OODoc module

barryp's picture

Jean-Marie Gouarné contacted me via e-mail with some updates on the status of his excellent Perl module. Here's what he said:

Thanks for this article. It's very useful for evangelization about the OOo XML format... And (that is much less important) thanks for your test with my OpenOffice::OODoc module!

However, I've just 2 remarks about your quotation of this Perl module:

1) OpenOffice::OODoc *can* create new OOo files (texts, spreadsheets, presentations and drawings) from scratch; this feature is available since version 1.201 (2004-07-30). To do so, the ooDocument() constructor must be called with a create => $class option (where $class is the document class, i.e. "text", "spreadsheet", etc).

2) The module has notably evolved in the meantime; now it supports both the OpenOffice.org 1.0 and the OpenDocument formats; in addition, there are a few draw- or impress-focused methods (so, for example, such methods as insertDrawPage or appendDrawPage are available in order to organize and copy presentation slides). But you were right when you said that "the module was created with a view to working primarily with OpenOffice.org Writer files". Text documents were and remain the main target.

I thought it worthwhile to post his message here. Thanks.

--
Paul Barry
IT Carlow, Ireland
http://glasnost.itcarlow.ie/~barryp

Writing to Impress from Perl

Michelle Chang's picture

An easy way to write to Impress/PowerPoint, as Jean-Marie said:

#! /usr/bin/perl -w

use strict;
use OpenOffice::OODoc;

# start a new preso
my $preso = ooDocument(file => 'test.sxi', create => 'presentation');

my $slide = $preso->getDrawPage(0); # slide 0

$preso->createTextBox
(
    attachment => $slide,
    size => '10cm, 2cm',
    position => '1cm, 2cm',
    content => 'I want to write to Impress from Perl'
);

$preso->save;

Programmatic Conversions?

Jordan's picture

Thanks for the great article! Do you know if there is a way to programmatically convert the resulting Impress document to PowerPoint? Perhaps as in $preso->export(...)?

Or would I need to use something like the Python-UNO bridge to do so?

image extraction from pdf's

girvim01's picture

Would "pdfimages" (part of the xpdf package) have sped up the final step (extraction of images from the pdf page proofs)?

using pdfimages?

barryp's picture

> sped up the final step?

maybe ... if I had known about it! :-)

Thanks - I'll check out pdfimages.

Paul.

Paul Barry
