A guide to using PDFs on GNU/Linux

Although GNU/Linux has long supported the PostScript format, full support for the related PDF format has been slower to arrive. Today, however, PDF support is finally starting to equal what is available on other operating systems. Whether you are printing, editing, or viewing PDF files, you now have a choice of applications on both the command line and the desktop.

What follows is not an exhaustive list of choices, but a survey of the main tools available. Taken together, they should be enough to fill most of your PDF needs.

Printing to PDF

GNU/Linux offers several options for producing PDF files. At the command line, you can use ps2pdf, a script that comes bundled with Ghostscript. As its name suggests, ps2pdf converts PostScript files to PDF format. You can produce a PostScript file from within any application by setting up a PostScript printer to print to file (you don't actually need the physical printer). From there, all you need is to enter the command ps2pdf <input.ps> <output.pdf>. If you omit the output file, ps2pdf produces a file that has the same name as the PostScript input file, but with a .pdf extension.
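As a minimal sketch of the ps2pdf step described above, the following shell functions wrap both forms of the command. The function names default_pdf_name and ps_to_pdf are hypothetical helpers introduced only for illustration, and report.ps is a placeholder file name; ps2pdf itself is the only real tool involved.

```shell
#!/bin/sh
# Sketch only: thin wrappers around ps2pdf (bundled with Ghostscript).

# Hypothetical helper: the file name ps2pdf chooses when no output is given.
# It strips any directory and the .ps suffix, then appends .pdf.
default_pdf_name() {
    printf '%s.pdf\n' "$(basename "$1" .ps)"
}

# Hypothetical helper: convert $1 to PDF, writing to $2 if it is supplied.
ps_to_pdf() {
    if [ -n "${2:-}" ]; then
        ps2pdf "$1" "$2"
    else
        ps2pdf "$1"    # output name is derived from the input name
    fi
}

# Usage (placeholder file names):
#   ps_to_pdf report.ps               # writes report.pdf
#   ps_to_pdf report.ps final.pdf     # writes final.pdf
```

Nothing runs until you call ps_to_pdf yourself, so the file can be sourced safely into an interactive shell.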

On the desktop, you can use Print to File (PDF) for KDE-aware applications like KWord, selecting it as a printer from within the application. A more universally useful solution is the CUPS-PDF driver, which you can set up as a CUPS printer. For CUPS-PDF, use the device URI of cups-pdf:/, and select Generic from the list of manufacturers and PostScript from the list of models. CUPS-PDF writes all output to a sub-folder of your home directory with the same name as the printer you set up.

However, all three of these solutions fall far short of the full functionality available for other operating systems. If you want more control over the PDF files you produce, a better choice is OpenOffice.org's File -> Export to PDF, especially when the extendedPDF add-on is installed. This option gives you control over image compression, bookmarks, forms, the default view, and security options -- just about everything except font embedding, which is enabled by default (an option that simplifies the sharing of files between different computing systems). The main drawback is that the combination produces PDF files that are, on average, about one-third larger than those produced by CUPS-PDF. Otherwise, the degree of control it allows should make it the preferred method of PDF production.

Editing PDFs

When you have an existing PDF file that you need to edit, the quickest solution on any platform is to make your changes in the source file and produce a new file. However, if you don't have the source file, you have at least two options.

From the command line, pdftk offers a variety of functions. Using pdftk, you can split, merge, decrypt, and encrypt PDF files, as well as fill out forms, rotate pages, add watermarks, or edit metadata. In some cases, you can even repair a corrupted file. The one drawback to pdftk is that it involves editing in text mode what is basically a graphical document. For this reason, editing with pdftk can often be a challenge, even if you have a copy of the file that you are editing open in a viewer.
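To make a few of the pdftk operations mentioned above concrete, here is a rough sketch covering merging, splitting, and dumping metadata. The function names and the DRY_RUN switch are assumptions of this sketch, not part of pdftk; the pdftk operations used (cat, burst, dump_data) are the tool's own.

```shell
#!/bin/sh
# Sketch only: small wrappers around common pdftk operations.

# Print the command instead of running it when DRY_RUN=1
# (a convention of this sketch, not a pdftk feature).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# merge_pdfs OUTPUT.pdf INPUT.pdf...  : concatenate the inputs in order
merge_pdfs() {
    out=$1
    shift
    run pdftk "$@" cat output "$out"
}

# split_pdf INPUT.pdf  : write one PDF per page into the current directory
split_pdf() {
    run pdftk "$1" burst
}

# dump_metadata INPUT.pdf REPORT.txt  : write metadata and bookmarks as text
dump_metadata() {
    run pdftk "$1" dump_data output "$2"
}

# Usage (placeholder file names):
#   DRY_RUN=1 merge_pdfs book.pdf ch1.pdf ch2.pdf   # shows the command only
#   merge_pdfs book.pdf ch1.pdf ch2.pdf
```

Trying a command with DRY_RUN=1 first is a cheap way to check the argument order before pdftk touches any files.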

For many, a more practical solution is PDFedit. Although still in early development, PDFedit has many of the functions of pdftk. Even more importantly, it allows you to edit individual characters. This ability is painfully slow, and limited to a buffer of about fifty characters. Nor is it up to editing more than a line at a time, which would require massive reformatting of the document. Still, Adobe Acrobat itself offers no better character editing of PDFs, and, even in its current release, PDFedit is a welcome addition to the PDF tool kit. Between it and pdftk, GNU/Linux users should be able to achieve approximate parity with users of other operating systems, at least in terms of functionality -- convenience, unfortunately, remains elusive.

Viewing PDFs

PDF viewers are easier to implement than PDF generators or editors, so GNU/Linux has a number of them. For those with no objection to using non-free software, Adobe has released Adobe Acrobat Reader for Linux 7.0, which is on a par with its readers for other operating systems, and allows viewers to see both bookmarks and metadata.

The available free viewers are a mixed lot. Both gv and xpdf display the pages of a PDF document, and include the standard options for changing the magnification and for navigating documents. However, neither shows thumbnails, bookmarks, or metadata. Since both GNOME's Evince and KDE's KPDF can display all three, either one of them is generally a better choice for a free viewer -- about nine times out of ten. However, every now and then, gv or xpdf manages to read PDF files that Evince or KPDF cannot. Whatever the reason for this difference in functionality, it means that you should at least keep the full range of choices in the back of your mind, even if you do not want to install every free viewer on the off-chance that you need it.

Conclusion

PDF support on GNU/Linux still has some gaps. Still, it has improved immensely from even a few years ago, until today it is rapidly approaching the functionality found on other platforms. Most of the time, the challenge now is not finding the tools you need so much as switching between a variety of shell and desktop interfaces in your search. A single suite of programs like Adobe Acrobat with a more or less common interface would improve the situation immensely for the average user.

However, that is a challenge for another day. For now, what matters is that, with a little bit of flexibility and ingenuity, you can consider one of the long-standing gaps in GNU/Linux functionality close to being bridged.


Bruce Byfield is a computer journalist who writes regularly for NewsForge, Linux.com, and IT Manager's Journal.

______________________

--
Bruce Byfield (nanday)

Comments

Still a ways to go

dave shields

I'm working on a series of posts on the ODF and OOXML specifications. They are each available on the web, and each is provided in two formats: its own and PDF. The only format common to both is PDF.

So I had to work with PDF files on Ubuntu. I found this experience so frustrating I wrote a post about it, PDF: A Portable, Persnickety, Problematic, and Proprietary Document Format

Though Linux PDF support has improved, there is still much to be done.

thanks, dave

balloon-like comments for protected pdf files

Iggy

I haven't found any free software yet that is able to add balloon-like comments to protected PDF files (you can do it with Adobe Acrobat or Nitro PDF under Win), which I need to do when I get proofs from the publisher.

Does anybody know something that can be used for this?

BTW, this is currently one of the very few reasons I still need a Win partition on my HD.

Editing PDF files on Linux

David Breakey

While it's not a perfect solution (not open source, or otherwise free), an option to investigate if you need to be able to edit an existing PDF file is Pagestream.

I haven't messed around with this particular feature an awful lot, but Pagestream is a commercial DTP package (currently $150 for the "Pro" edition) capable of "opening" a PDF document, and converting it into an editable form (note, I haven't used this for a while, so I don't know if it is still in the current version; also, I believe this is a feature of the "Pro" edition).

Granted, this means it is turned into a Pagestream document, but if you no longer have access to the original source document, that could be a lifesaver.

While I've had no problems opening a PDF document, saving it as a PDF document again can be … interesting, mostly due to how Pagestream handles font mapping.

And yes, Pagestream is available for Linux; it has roughly the same software requirements as GNOME (essentially, a working GTK+ installation).

And before anyone wonders, no I don't work for them; I'm just a satisfied user of the program who feels they deserve the exposure.

PDF Reader

Krendoshazin

For rendering PDFs I would recommend epdfview; it's a standalone GTK PDF reader that uses poppler for rendering. Poppler is based on xpdf-3.0 code, so it has all the features that xpdf provides while being nicely rendered through GTK (GUI, fonts, etc).

Bjorn Solstad

Yeah. epdfview works like a charm :)

Editing PDFs with KWord

Anonymous

Sometimes the most convenient way to edit a PDF is to import it into KWord and make the changes there. It does a good job on many PDF forms that you would otherwise have to print and fill in by hand.

KWord's not bad, but of course not perfect

Anonymous

If you need something to edit PDF documents in a crunch, yep, KWord's PDF importer will often do the trick. The caveat is that it needs to be a pretty simple PDF document; KWord doesn't seem to translate the complex formatting of many PDF documents all that well. That said, I'm glad it's available.

PDF, a de-facto and truly open standard (unlike MS's Uh-Oh-XML), is great for preserving exact formatting, but it never has lent itself well to editing existing documents. That's because it was never intended for that purpose. If you need to edit documents with a truly open standard, well, that's what OpenDocument is for. Thus, we should be pushing OpenOffice.org, KOffice, StarOffice, and anything else that supports OpenDocument *natively*.

comments

Anonymous

A nice feature of Adobe Acrobat is the ability to mark up a document with comments. I haven't seen much on the free-software front that attempts this. Is this hard to implement, or is it just one of those rarely used features?

For annotations

Anonymous

Try flpsed, which was mentioned earlier by someone else in the comments here.

Anonymous

Another useful pdf tool not mentioned in the article is mbtPDFasm. I've used it to combine a series of pdfs into one pdf file and to insert page numbers into the resulting file. It's a command line tool, so there is some syntax to learn.

Carlos Moreno

What am I missing? I'm quite shocked to see an article on PDF support on Linux that fails to mention... let's see... oh, Adobe's Acrobat Reader for Linux!!!!

I know, it's not Open Source, and thus not strictly *free* software, but it is at least *gratis*, and, above anything else, it *is* a PDF reader that runs on Linux and is supported by Adobe...

Am I the only one that finds it inexplicable that such an article fails to mention such an obvious and such an important detail?

Carlos

Bjorn Solstad

Hehe. I noticed that as well. Of course Adobe's Acrobat Reader for Linux should have been mentioned ;)

Anonymous

Yes, you're the only one who finds it inexplicable: because the article *does* mention Adobe Acrobat.

Anonymous

Acrobat reader supports DRM, so it should be excluded by disqualification since it's software that works against your best interests.

Advocating the usage of Acrobat Reader is like advocating the use of guns with a second barrel pointing back to you.

Huh?

peter.green

"What am I missing?"

Well, you're missing the first paragraph in the section titled "Viewing PDFs", which reads:

"PDF viewers are easier to implement than PDF generators or editors, so GNU/Linux has a number of them. For those with no objections to use non-free software, Adobe has released Adobe Acrobat Reader for Linux 7.0, which is on a par with its readers for other operating systems, and allows viewers to see both bookmarks and metadata."

NIKKELS

Acrobat Reader for Linux is, in my opinion, far behind Acrobat Reader for Windows.
Acrobat Reader for Linux is NOT capable of playing embedded music or speech from within its files.
Acrobat Reader for Windows does it perfectly well.
Furthermore, Acrobat Reader for Linux is not even capable of displaying these files. The text is rendered USELESS. The situation has been like that for 18 months already.
Adobe is aware of it and doesn't give a shit.

Pictures are available, and so is the file... (to Adobe developers)

Size of PDF Output

Anonymous

I'm more concerned about the size of PDF output. I maintain a website where I upload PDF documents. Using PDFCreator on Windows, I routinely generate PDF documents of 8-10KB. This is really nice of PDFCreator to produce such small files.

Using OpenOffice's "Export to PDF" on Windows, the PDF files, with the same original documents, are about 80-90KB.

Using OpenOffice or CUPS-PDF on Linux, the PDF files are 300+KB.

I'd rather be Windows-free on my home machines, but until I can generate 8-10KB PDF files on Linux, I need to use PDFCreator on Windows.

Xanni

Handy hint: You can run pdf2ps on existing PDF files and it will change embedded fonts to embedded font subsets, usually reducing the file size substantially.

Xanni

D'oh! That should of course be ps2pdf.
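The hint above, as corrected, might be tried out along these lines. This is only a sketch: file_size and recompress_pdf are hypothetical helper names, big.pdf and smaller.pdf are placeholders, and whether the output actually shrinks depends on the fonts and images in the particular file.

```shell
#!/bin/sh
# Sketch only: re-run an existing PDF through ps2pdf and compare sizes.

# Hypothetical helper: size of a file in bytes.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# Hypothetical helper: recompress_pdf INPUT.pdf OUTPUT.pdf
recompress_pdf() {
    ps2pdf "$1" "$2" || return 1
    echo "before: $(file_size "$1") bytes"
    echo "after:  $(file_size "$2") bytes"
}

# Usage (placeholder file names):
#   recompress_pdf big.pdf smaller.pdf
```

Writing to a separate output file keeps the original intact in case the result is not smaller or loses something you need.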

Being able to read the document is more important

Anonymous

Being able to read the document is more important than size. The only way you will have reproducible document fidelity is by embedding the fonts used when you create the PDF. It is not just a matter of having the same fonts on screen or in print, but also preserving the original layout of the document as you, the author, intended.

You cannot presume that every GNU/Linux user out there has installed copies of every single commercial font that MS Corp has bundled with its operating systems and software products and that goes for users of MS Windows as well. Not even Times New Roman, Arial and Courier New because there are different versions, with different complements of character sets.

Most PDFs out there without embedding and subsetting the original fonts (to make output size smaller) look plain horrible in every other machine except the one of the person who produced the document. I, for one, have replaced my copies of Times New Roman, Arial and Courier New with RedHat's Liberation fonts everywhere.

Printing A4 to Letter

Jordan Wilberding

Does anyone know how to have CUPS print A4 PDFs as letter-sized? Our printer sometimes complains that the PDFs are in A4 format, so we have to keep pressing OK when it asks if we would rather print them as letter-sized. It would be nice to avoid this.

A4 to letter

johnrobert

Hi Jordan. I've had the same problem, so after I read your comment I went looking and have come up with something I'm going to try. Apparently xpdf has a pdftops tool that will convert your A4 pdf to postscript. Then, I'm going to try to resize the pages of the postscript file using a tool called psresize that I read about on this page: http://www.dsl.org/cookbook/cookbook_25.html.

If that works, I may try to package it into a bash script.

Good luck!
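The pdftops-then-psresize idea in the comment above could be packaged roughly as follows. This is a sketch under assumptions: a4_to_letter, run, and the DRY_RUN switch are hypothetical names made up here, the intermediate file name is an arbitrary choice, and it presumes pdftops (from the xpdf tools) and psresize (from psutils) are installed.

```shell
#!/bin/sh
# Sketch only: convert an A4 PDF to Letter-sized PostScript.

# Print the command instead of running it when DRY_RUN=1
# (a convention of this sketch, not a feature of either tool).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# a4_to_letter INPUT.pdf OUTPUT.ps
a4_to_letter() {
    # Intermediate A4 PostScript file, named after the output file.
    tmp="${2%.ps}.a4.ps"
    # Step 1: PDF to PostScript.
    run pdftops "$1" "$tmp" || return 1
    # Step 2: rescale pages from A4 (-P, input paper) to Letter (-p, output paper).
    run psresize -Pa4 -pletter "$tmp" "$2" || return 1
    # Step 3: clean up the intermediate file.
    run rm -f "$tmp"
}

# Usage (placeholder file names):
#   DRY_RUN=1 a4_to_letter doc.pdf doc.ps   # shows the three commands only
#   a4_to_letter doc.pdf doc.ps
```

The result is a PostScript file rather than a PDF, which is fine if the goal is simply to send it to the printer.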

Use the "fitplot" commandline option

Kurt W. Pfeifle

"Does anyone know how to have cups print A4 pdfs as letter sized?"

I do.

"Our printer sometimes complains that the pdfs are in A4 format, so we have to keep press ok when it asks if we would rather print it as a letter sized. It would be nice to avoid this."

Use this command:

lp -d your_printer_name -o PageSize=Letter -o fitplot=true /path/to/printfile.pdf

"PageSize=Letter" selects the output media format, "fitplot" tells CUPS to scale the input file from whatever size it is to "Letter".

You can save these options permanently into "~/.lpoptions" (for CUPS 1.1.x) and/or "~/.cups/lpoptions" (for CUPS 1.2.x) by adding a line like this to that file:

your_printer_name fitplot=true PageSize=Letter

The following command will also create that line for you:

lpoptions -p your_printer_name -o fitplot=true -o PageSize=Letter

Once that line is in your (.)lpoptions file, these options will become your printing defaults and be automatically applied unless specified otherwise.

pdf editing

Anonymous

For editing PDFs there is also flpsed: you can write over an existing PDF document (like a typewriter). It works great for filling in forms.

KPDF would be nearly perfect if...

Tux Runner

KPDF lacks one seriously-needed feature: the ability to rotate PDFs. Or, does it do this already and I'm just unaware? XPDF does this already.

xpdf can rotate

Destructor

yeah, xpdf can rotate the pdf files, and the xpdf.utils can convert them to .txt or .ps files.

kpdf --> okular in KDE4

Anonymous

KDE4 will have a PDF viewer (okular, which in reality is a multi-format viewer) that is able to do that, and much more.

Peter

Great
