Building Impress and PowerPoint Slides with LaTeX and Perl

Forced to use proprietary file formats? Let open source ease the burden.

The title_slide subroutine returns raw XML, which is inserted into the document.
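
The body of title_slide is not shown at this point, so what follows is only a minimal sketch of the idea: a subroutine that escapes its text and returns an XML fragment as a string. The element and attribute names are assumptions modelled on the OpenOffice.org 1.x drawing markup, and escape_xml is a hypothetical helper; the markup actually required should be copied from the content.xml inside blank.sxi.

```perl
#! /usr/bin/perl -w

use strict;

# Hypothetical helper: escape the characters that are special in XML
# so that slide text cannot break the surrounding markup.
sub escape_xml
{
    my ( $text ) = @_;

    $text =~ s/&/&amp;/g;     # ampersands first, or later escapes get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;

    return $text;
}

# Illustrative stand-in for title_slide: returns raw XML for one slide.
# The element and attribute names are assumptions, not the article's XML.
sub title_slide
{
    my ( $name, $title ) = @_;

    my $safe_name  = escape_xml( $name );
    my $safe_title = escape_xml( $title );

    return <<"XML";
<draw:page draw:name="$safe_name">
  <draw:text-box presentation:class="title">
    <text:p>$safe_title</text:p>
  </draw:text-box>
</draw:page>
XML
}

print title_slide( 'page1', 'Perl & XML' );
```

Escaping before interpolation matters because chapter titles taken from LaTeX can contain an ampersand or angle bracket, either of which would leave the slide document malformed.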

Given an input file in the textual format produced by getcontent, the produce_slides script clones the blank.sxi Impress file and populates any number of slides, programmatically producing a presentation. The script resembles getcontent in structure; its only wart is the verbatim inclusion of the XML required for each of the three slide types contained in blank.sxi. To create a presentation, invoke produce_slides as follows:

perl produce_slides 3 chapter3.input

This results in a new Impress document called chapter3.sxi appearing on disk.
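
The first job of a script like produce_slides is to turn the tagged lines that getcontent emits (CHAPTERTITLE, BULLETTITLE, BULLETCONTENT, GRAPHICCAPTION and so on) into one record per slide. The sketch below shows one way that grouping might look. The group_slides name, the record layout and the mapping from tags to slide kinds are all assumptions made for illustration, and the sample input is invented; lines inside code and maxim blocks are simply ignored here.

```perl
#! /usr/bin/perl -w

use strict;

# Group getcontent-style tagged lines into a list of slide records.
# The mapping from tags to slide kinds is an assumption of this sketch.
sub group_slides
{
    my @lines = @_;
    my @slides;

    foreach my $line ( @lines )
    {
        chomp $line;

        if ( $line =~ /^CHAPTERTITLE: (.*)/ )
        {
            push @slides, { kind => 'title', title => $1, items => [] };
        }
        elsif ( $line =~ /^BULLETTITLE: (.*)/ )
        {
            push @slides, { kind => 'bullet', title => $1, items => [] };
        }
        elsif ( $line =~ /^(?:CHAPTERCONTENT|BULLETCONTENT): (.*)/ )
        {
            # Content lines attach to the most recently started slide.
            push @{ $slides[-1]{items} }, $1 if @slides;
        }
        elsif ( $line =~ /^GRAPHICCAPTION: (.*)/ )
        {
            push @slides,
                { kind => 'graphic', title => $1, items => [], image => '' };
        }
        elsif ( $line =~ /^GRAPHICNAME: (.*)/ )
        {
            # getcontent prints the name straight after the caption.
            $slides[-1]{image} = $1
                if @slides && $slides[-1]{kind} eq 'graphic';
        }
    }

    return @slides;
}

# Invented sample input, for demonstration only.
my @demo = group_slides(
    "CHAPTERTITLE: A Sample Chapter\n",
    "BULLETTITLE: First Section\n",
    "BULLETCONTENT: First point\n",
    "BULLETCONTENT: Second point\n",
    "GRAPHICCAPTION: A sample figure\n",
    "GRAPHICNAME: sample.eps\n",
);

printf "%-8s %s\n", $_->{kind}, $_->{title} foreach @demo;
```

Each record can then be handed to whichever subroutine produces the XML for that kind of slide.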

With the Impress files created, I needed to replace my graphic image placeholders with the actual images. The getcontent script had extracted only each image's filename, however, not the image itself. Importing the images into Impress should have been straightforward, except that the originals I had were of much poorer quality than those that made it into the book. The final images had been improved greatly during the publisher's final typesetting phase, and, of course, I didn't have the final image files.

Then I remembered that the publisher had sent final proof PDFs with all the high-quality graphic images in place. I used xpdf to view the proofs at 200% and then fired up The GIMP to screen-capture the xpdf display window. I then cut out the graphic image and saved it as a JPEG. It took a little while, but when finished I had a beautiful set of book-quality images to import into my Impress presentations. With this task complete, I exported the Impress document to PowerPoint format and the job was done. My initial estimate of 20 days of effort was reduced to about 20 hours of real work.

And now, of course, if I need to produce some slides quickly, I can create my textual content manually in vi, run it through the produce_slides script and I'm done.

Final Words

What started off as a seemingly impossible task—programmatically producing PowerPoint presentations—turned out to be quite possible, thanks to open source. All the tools I needed shipped out of the box with my stock Red Hat 9 distribution: vi, unzip, Perl, xmllint, xpdf, The GIMP and the OpenOffice.org suite.

Resources for this article: /article/8055.

Paul Barry (paul.barry@itcarlow.ie) lectures at the Institute of Technology, Carlow, in Ireland. Information on the courses he teaches, in addition to the books and articles he has written, can be found on his Web site, glasnost.itcarlow.ie/~barryp.

______________________

Comments

Great ideas, thanks!

Anonymous's picture

Great ideas, thanks!

getcontent script

Norberto's picture

I am not a Perl programmer myself. Is there a way to obtain a working getcontent script?

Best

Missing getcontent script

barryp's picture

Sorry ... the script appears to be missing from the download. Here it is:

#! /usr/bin/perl -w

#
# The "getcontent" script: Given a LaTeX file on the command-line,
# extract its textual content.
#
# By Paul Barry, paul.barry@itcarlow.ie
#

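use strict;

use constant TRUE  => 1;
use constant FALSE => 0;

my $in_verbatim  = FALSE;   # inside a verbatim or alltt block
my $in_maxim     = FALSE;   # inside a maxim block
my $graphic_name = '';      # most recent image filename seen

while ( <> )
{
    # Inside a maxim: pass lines through until the environment ends.
    if ( $in_maxim )
    {
        if ( /\\end\{maxim\}/ )
        {
            print "STOPMAXIM\n";
            $in_maxim = FALSE;
        }
        else
        {
            print;
        }
        next;
    }

    # Inside a code listing: pass lines through until the environment ends.
    if ( $in_verbatim )
    {
        if ( /\\end\{verbatim\}/ || /\\end\{alltt\}/ )
        {
            print "STOPCODE\n";
            $in_verbatim = FALSE;
        }
        else
        {
            print;
        }
        next;
    }

    # Headings become slide titles and bullet points.
    if ( /\\chapter\{(.*)\}/ )
    {
        print "CHAPTERTITLE: $1\n";
        next;
    }

    if ( /\\section\{(.*)\}/ )
    {
        print "BULLETTITLE: $1\n";
        next;
    }

    if ( /\\subsection\{(.*)\}/ )
    {
        print "BULLETCONTENT: $1\n";
        next;
    }

    if ( /\\begin\{verbatim\}/ || /\\begin\{alltt\}/ )
    {
        print "STARTCODE\n";
        $in_verbatim = TRUE;
        next;
    }

    if ( /\\begin\{maxim\}/ )
    {
        print "STARTMAXIM\n";
        $in_maxim = TRUE;
        next;
    }

    # Remember the image filename until its caption turns up.
    if ( /images\/(.*?)\}/ )
    {
        $graphic_name = $1;
        next;
    }

    if ( /\\caption\{\\label\{/ )
    {
        # Capture the caption text; fall back to an empty string when
        # the pattern fails so that $1 is never used uninitialised.
        my ( $caption ) = /label\{.*?\}(.*)\}\}/;
        $caption = '' unless defined $caption;

        print "GRAPHICCAPTION: $caption\n";
        print "GRAPHICNAME: $graphic_name\n";
        next;
    }

    if ( /^\\textit\{(.*)\}/ )
    {
        print "CHAPTERCONTENT: $1\n";
        next;
    }
}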

Paul Barry

Some important updates to the OpenOffice::OODoc module

barryp's picture

Jean-Marie Gouarné contacted me via e-mail with some updates on the status of his excellent Perl module. Here's what he said:

Thanks for this article. It's very useful for evangelization about the OOo XML format... And (that is much less important) thanks for your test with my OpenOffice::OODoc module!

However, I've just 2 remarks about your quotation of this Perl module:

1) OpenOffice::OODoc *can* create new OOo files (texts, spreadsheets, presentations and drawings) from scratch; this feature is available since version 1.201 (2004-07-30). To do so, the ooDocument() constructor must be called with a create => $class option (where $class is the document class, i.e. "text", "spreadsheet", etc).

2) The module has notably evolved in the meantime; now it supports both the OpenOffice.org 1.0 and the OpenDocument formats; in addition, there are a few draw- or impress-focused methods (so, for example, such methods as insertDrawPage or appendDrawPage are available in order to organize and copy presentation slides). But you were right when you said that "the module was created with a view to working primarily with OpenOffice.org Writer files". Text documents were and remain the main target.

I thought it worthwhile to post his message here. Thanks.

--
Paul Barry
IT Carlow, Ireland
http://glasnost.itcarlow.ie/~barryp

Paul Barry

Writing to Impress from Perl

Michelle Chang's picture

An easy way to write to Impress/PowerPoint, as Jean-Marie said:

#! /usr/bin/perl -w

use strict;
use OpenOffice::OODoc;

# start a new preso
my $preso = ooDocument(file => 'test.sxi', create => 'presentation');

my $slide = $preso->getDrawPage(0); # slide 0

$preso->createTextBox
(
    attachment => $slide,
    size       => '10cm, 2cm',
    position   => '1cm, 2cm',
    content    => 'I want to write to Impress from Perl'
);

$preso->save;

Programmatic Conversions?

Jordan's picture

Thanks for the great article! Do you know if there is a way to programmatically convert the resulting Impress document to PowerPoint? Perhaps as in $preso->export(...)?

Or would I need to use something like the Python-UNO bridge to do so?
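
One possible route, offered as a sketch rather than a tested recipe: current LibreOffice releases accept --headless and --convert-to on the soffice command line, so the conversion can be driven from Perl without going through UNO. This assumes such a soffice binary is on the PATH, which would not have been true of the OpenOffice.org shipped with Red Hat 9; the convert_command helper name is hypothetical.

```perl
#! /usr/bin/perl -w

use strict;

# Hypothetical helper: build the argument list for a headless conversion.
sub convert_command
{
    my ( $input, $outdir ) = @_;

    return ( 'soffice', '--headless',
             '--convert-to', 'ppt',
             '--outdir', $outdir,
             $input );
}

# Run the conversion only when a file is named on the command line.
if ( @ARGV )
{
    my @cmd = convert_command( $ARGV[0], '.' );

    # The list form of system() avoids shell quoting problems.
    system( @cmd ) == 0
        or die "soffice failed: $?\n";
}
```

Saved under a name of your choosing and run with chapter3.sxi as its argument, the script should leave the converted file in the current directory, provided the installed suite can still read the .sxi format.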

image extraction from pdf's

girvim01's picture

Would "pdfimages" (part of the xpdf package) have sped up the final step (extraction of images from the pdf page proofs)?

using pdfimages?

barryp's picture

> sped up the final step?

maybe ... if I had known about it! :-)

Thanks - I'll check out pdfimages.

Paul.

Paul Barry
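
For anyone wanting to try the pdfimages suggestion from Perl, a minimal sketch follows. It relies on the -j, -f and -l options of pdfimages (save DCT-encoded images as JPEG files, first page, last page) and assumes the tool is on the PATH; the pdfimages_command helper name is hypothetical. Whether the extracted images match the quality of a screen capture depends on how the publisher embedded them in the proofs.

```perl
#! /usr/bin/perl -w

use strict;

# Hypothetical helper: build a pdfimages command line.
# $first and $last are optional page numbers.
sub pdfimages_command
{
    my ( $pdf, $root, $first, $last ) = @_;

    my @cmd = ( 'pdfimages', '-j' );    # -j: write DCT images as .jpg files

    push @cmd, '-f', $first if defined $first;
    push @cmd, '-l', $last  if defined $last;

    return ( @cmd, $pdf, $root );
}

# Usage: give the PDF file and an image-root prefix on the command line,
# optionally followed by the first and last page numbers.
if ( @ARGV >= 2 )
{
    my @cmd = pdfimages_command( @ARGV );

    system( @cmd ) == 0
        or die "pdfimages failed: $?\n";
}
```

The image-root argument is the prefix pdfimages uses when naming the files it writes, so each chapter's proofs can be given a prefix of their own.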
