Overview of Linux Printing Systems
This article presents a brief overview of the main printing systems in use on most Linux systems, with an introduction to the concepts and procedures at the core of UNIX printing. We will finish by looking at the future of Linux printing and how quickly it is improving.
It is important to understand that printing in the Unix world revolves almost entirely around the PostScript page description language, developed by Adobe Corp. as a full-fledged programming language used to describe the contents of each page of a document. Many printers nowadays have an embedded PostScript interpreter, which is in charge of rendering the pages to paper using their PostScript description. All modern Linux desktop applications that have a print option will produce PostScript data to print full-page documents.
This approach is very different from that of other desktop-oriented operating systems, and from it stem most of the problems that have made Unix printing such a daunting task. Operating systems like Windows or MacOS make much more tightly integrated APIs available to applications, often exposing the capabilities of the printers and providing an abstraction layer so that applications don't have to worry about device-specific details. Moreover, the printing API is usually integrated with the graphics API used for displaying on the screen, something that has yet to happen with X11.
On most Unix systems, the only available interface pretty much boils down to "submit a job to a queue and hope that it prints correctly". There is no unified way of gathering printer or job status, which seriously limits what Linux applications can offer with regard to printing.
While PostScript is the de facto standard for producing documents to be printed on Linux, the printer itself doesn't have to understand PostScript, which remains a relatively expensive technology. In many cases, especially with lower-end printers, the PostScript data has to be translated to the native page description language of the printer. This is done by a conversion filter. Generally speaking, a filter is a program that reads data on its input and writes the processed result on its output. Several types of filters are used in the context of Linux printing: conversion filters, I/O filters (responsible for transferring data to the device) and processing filters (which transform the document data).
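As a rough illustration of the filter idea, the sketch below implements a small processing filter in Python. It is a hypothetical example written for this overview, not a filter shipped with any spooler: the names `text_filter` and `run_as_filter` are assumptions, and the transformation (converting bare line feeds to CR+LF and ending the job with a form feed) is only one simple thing a text filter might do.

```python
import io
from typing import BinaryIO


def text_filter(data: bytes) -> bytes:
    """Convert LF line endings to CR+LF and end the job with a form feed."""
    # Normalise existing CR+LF pairs first so we never emit CR+CR+LF.
    normalised = data.replace(b"\r\n", b"\n")
    converted = normalised.replace(b"\n", b"\r\n")
    # A form feed asks a text-mode printer to eject the last page.
    return converted + b"\f"


def run_as_filter(source: BinaryIO, sink: BinaryIO) -> None:
    """Filter contract: read the whole job from the input, write the result to the output."""
    sink.write(text_filter(source.read()))


# In-memory demonstration; the streams stand in for the filter's input and output.
demo_out = io.BytesIO()
run_as_filter(io.BytesIO(b"line one\nline two\n"), demo_out)
print(demo_out.getvalue())
```

Used as a real filter, the same function would be called as `run_as_filter(sys.stdin.buffer, sys.stdout.buffer)`, so that a spooler can connect the job data to its input and the device (or the next filter) to its output.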
The basis of a printing system is the spooler. The spooler manages queues of print jobs. A queue is usually associated with a single printer, and jobs submitted by users are processed on a first come, first served basis. When a job comes up for processing, its data is usually passed through a number of filters before it reaches the printer itself. UNIX print spoolers come in many different forms. We will focus here on the most popular variants, which are present in most Linux distributions.
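The queueing behaviour described above can be modelled in a few lines. The following is a toy sketch, not how any real spooler is implemented: the class `PrintQueue` and its methods are hypothetical, jobs live in memory rather than in a spool directory, and filters are plain Python callables instead of external programs.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Filter = Callable[[bytes], bytes]


class PrintQueue:
    """Toy model of one spooler queue: FIFO job order plus a filter chain."""

    def __init__(self, filters: List[Filter]) -> None:
        self.filters = filters
        self.jobs: Deque[Tuple[int, bytes]] = deque()
        self.next_id = 1

    def submit(self, data: bytes) -> int:
        """Append a job to the end of the queue and return its job ID."""
        job_id = self.next_id
        self.next_id += 1
        self.jobs.append((job_id, data))
        return job_id

    def process_next(self) -> Tuple[int, bytes]:
        """Take the oldest job and run its data through every filter in order."""
        job_id, data = self.jobs.popleft()
        for apply_filter in self.filters:
            data = apply_filter(data)
        return job_id, data


# Two stand-in filters: upper-case the text, then append a form feed.
queue = PrintQueue(filters=[bytes.upper, lambda d: d + b"\f"])
queue.submit(b"first job")
queue.submit(b"second job")
print(queue.process_next())  # the job submitted first is processed first
```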
As its name implies, the BSD printing system originated in the Berkeley distribution of UNIX. Its Line Printer Daemon (LPD) is still the basis for many other printing systems and spoolers, which borrowed its interface and its configuration file format, the printcap file. While LPD was initially developed for line printers that could only print one line of text at a time, it can be used with full-page printers as well.
This was the printing system that made it into the first complete Linux distributions, like the early versions of Slackware. Nowadays, many distributions (Debian and Slackware, for example) still ship this print spooler, often alongside the more modern print systems discussed in this article. Many variants of the original BSD spooler are still in use today.
The BSD printing system is really just a spooler: its core functionality is limited to queuing jobs. It consists of a daemon (lpd), a couple of configuration files in /etc where queues and their properties are defined, a spooling directory where pending jobs are held (usually /var/spool/lpd), and a set of basic commands to submit, list, delete and manipulate jobs (lpr, lpq, lprm, lpc).
Queues are defined in the /etc/printcap file, which follows the same format as termcap files, used to describe the capabilities of UNIX terminals. A typical printer entry would look like this:
# Sample queue definition for BSD LPD
lp|printer1:\
    :sd=/var/spool/lpd/lp:\
    :lp=/dev/lp0:\
    :if=/usr/sbin/somefilter:\
    :mx#0:\
    :sh:
Each entry defines a queue. There can be several queues referring to the same physical printer (for instance to distinguish certain options). A queue can also have several aliases. In the example above, the queue lp has an alias 'printer1'. Jobs can be sent to either of these printer names, and will be dropped in the same queue. As a side note, 'lp' is usually considered the default queue in the BSD world.
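To make the structure of such an entry concrete, here is a minimal Python sketch that splits the sample entry into its queue names and its capabilities. The function `parse_printcap_entry` is hypothetical and deliberately simplified: it assumes one entry per call, backslash line continuations, and values that contain no colon, none of which a real printcap parser can take for granted.

```python
from typing import Dict, List, Tuple, Union

Value = Union[str, int, bool]


def parse_printcap_entry(text: str) -> Tuple[List[str], Dict[str, Value]]:
    """Split one printcap entry into its queue names and its capabilities."""
    # Drop comments and blank lines, strip trailing backslashes, join the rest.
    parts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts.append(line.rstrip("\\").strip())
    fields = [f.strip() for f in "".join(parts).split(":") if f.strip()]

    names = fields[0].split("|")  # primary queue name followed by any aliases
    caps: Dict[str, Value] = {}
    for field in fields[1:]:
        if "=" in field:  # string capability, e.g. sd=/var/spool/lpd/lp
            key, value = field.split("=", 1)
            caps[key] = value
        elif "#" in field:  # numeric capability, e.g. mx#0
            key, number = field.split("#", 1)
            caps[key] = int(number)
        else:  # boolean flag, e.g. sh
            caps[field] = True
    return names, caps


SAMPLE = r"""
# Sample queue definition for BSD LPD
lp|printer1:\
    :sd=/var/spool/lpd/lp:\
    :lp=/dev/lp0:\
    :if=/usr/sbin/somefilter:\
    :mx#0:\
    :sh:
"""

names, caps = parse_printcap_entry(SAMPLE)
print(names)
print(caps)
```

The three branches mirror the three kinds of capability visible in the sample: string values introduced by `=`, numeric values introduced by `#`, and bare boolean flags.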
Jobs are submitted to the spooler via the lpr command. A specific queue can be selected with the -P argument. For instance:
lpr -Pprinter1 /path/to/some/file
Jobs that have been submitted but have not yet been processed can be removed from the queue, using the lprm command. The job ID number, as well as various status information, can be obtained by running the lpq command.
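For scripts that need to drive these commands, a small wrapper can assemble the argument lists. The sketch below only builds the command lines and prints them; the helper names are hypothetical, and actually running the commands (for example with `subprocess.run`) assumes a host where the BSD lpr, lpq and lprm tools are installed.

```python
import shlex
from typing import List, Optional


def _queue_option(queue: Optional[str]) -> List[str]:
    """Return the -P option for a named queue, or nothing for the default queue."""
    return [f"-P{queue}"] if queue else []


def lpr_command(path: str, queue: Optional[str] = None) -> List[str]:
    """Argument list that submits a file as a new job."""
    return ["lpr"] + _queue_option(queue) + [path]


def lpq_command(queue: Optional[str] = None) -> List[str]:
    """Argument list that lists pending jobs and their job IDs."""
    return ["lpq"] + _queue_option(queue)


def lprm_command(job_id: int, queue: Optional[str] = None) -> List[str]:
    """Argument list that removes one pending job by its ID."""
    return ["lprm"] + _queue_option(queue) + [str(job_id)]


# Show the three command lines for the queue alias used in the example entry.
for command in (
    lpr_command("/path/to/some/file", "printer1"),
    lpq_command("printer1"),
    lprm_command(42, "printer1"),  # 42 is a made-up job ID
):
    print(shlex.join(command))
```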
BSD LPR is significant because it also defined the LPD network protocol, which is used to submit jobs to remote LPD daemons and allows UNIX workstations to function as print servers. This protocol is nowadays natively supported by virtually all networked printers. Because it is so widely used, all other printing systems have had to implement it, at least well enough to talk to other LPD daemons.
Here is an example of how to define a remote queue in a printcap file. The jobs will be immediately transferred to the remote queue on the remote LPD daemon, and won't be processed on the original host.
# Sample queue definition for a remote LPD queue on a client
remote:\
    :sd=/var/spool/lpd/remote:\
    :rm=printserver.domain.tld:\
    :rp=queue:\
    :mx#0:
The rm attribute indicates the address of the remote LPD server. The rp attribute is the name of the queue on this server where jobs will be sent.
The /etc/hosts.lpd file is used to define which hosts are allowed to forward jobs to the local LPD daemon.
The LPD protocol sends a job in two separate pieces. First, a control file describing the job is constructed and sent. This control file includes information about the originating user, the names of the files, and any options attached to the job. The data file then follows: it is the document itself, and its format depends entirely on the printing language in use.
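The two-piece transfer can be sketched as a sequence of byte strings. The example below follows the message layout of RFC 1179 (the published description of the LPD protocol) as best understood here, in the control-file-first order described above. It is a simplified sketch under stated assumptions: the helper names are hypothetical, only a handful of control-file commands are emitted, and the acknowledgement byte a server returns after each message is not modelled. Nothing is sent over the network.

```python
from typing import List


def build_control_file(host: str, user: str, job_number: int, source_name: str) -> bytes:
    """Build a minimal control file: one-letter command plus operand per line."""
    data_name = f"dfA{job_number:03d}{host}"
    lines = [
        f"H{host}",         # host the job originates from
        f"P{user}",         # user who submitted the job
        f"J{source_name}",  # job name shown in queue listings
        f"f{data_name}",    # print the data file as formatted text
        f"U{data_name}",    # remove the data file once printed
        f"N{source_name}",  # name of the original source file
    ]
    return "".join(line + "\n" for line in lines).encode("ascii")


def file_header(kind: int, name: str, payload: bytes) -> bytes:
    """Announce a file: kind byte (2 = control, 3 = data), size, space, name, LF."""
    return bytes([kind]) + f"{len(payload)} {name}\n".encode("ascii")


def job_messages(queue: str, host: str, user: str, job_number: int,
                 source_name: str, data: bytes) -> List[bytes]:
    """Return, in order, the messages a client would send for one job."""
    control = build_control_file(host, user, job_number, source_name)
    suffix = f"A{job_number:03d}{host}"
    return [
        b"\x02" + queue.encode("ascii") + b"\n",  # "receive a job" for this queue
        file_header(2, "cf" + suffix, control),   # control file announcement
        control + b"\x00",                        # control file, zero-terminated
        file_header(3, "df" + suffix, data),      # data file announcement
        data + b"\x00",                           # the document, zero-terminated
    ]


# Made-up host, user and job number, purely for illustration.
for message in job_messages("queue", "client", "alice", 7, "report.ps", b"%!PS\n"):
    print(message)
```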