Linux Print System at Cisco Systems, Inc.

Cisco runs a redundant system of 50 print servers using Linux, Samba and Netatalk. It prints to approximately 1,600 printers worldwide, serving 10,000 UNIX and Windows 95 users, some of whom are in mission-critical environments.

Simple Distributed Database (SDDB)

Due to the growth of Cisco and the sheer amount of day-to-day print administration, I had allowed the engineers to edit the master configuration file. However, this was creating a problem. Since the file was edited with vi, there was no locking, and people were beginning to overwrite each other's changes. Also, the file was getting so big that the mkprint program was taking a significant amount of time to run. I needed to put this configuration data into a database.

I wrote what I thought would be a “simple distributed database” (SDDB). I soon discovered the first two words are a contradiction in terms. It is actually more like a “network directory” than a database—it performs a function similar to Sun's NIS (Yellow Pages).

NIS maintains separate domains, each of which has a master server and multiple copy servers. While each NIS server can store the data for multiple domains, the data never merge. A client has to “bind”, or attach, to a particular domain on a particular server and can query data only in that domain.

SDDB also maintains separate domains, each with its own master server and copy servers. Each master server receives record updates for its own domain and propagates these changes to all the other servers across all domains. The data from each domain are merged on each server into a single contiguous database, with the originating domain stored on each record. Thus, when a client queries the data, it does so across all domains.
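The merge described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the SDDB code: the domain names, the printer names and the "domain" field used to tag each record are all hypothetical.

```python
from typing import Dict, List

Record = Dict[str, str]


def merge_domains(domains: Dict[str, List[Record]]) -> List[Record]:
    """Merge per-domain record lists into one list, tagging each record
    with the domain it came from."""
    merged = []
    for domain, records in domains.items():
        for record in records:
            tagged = dict(record)        # copy so the input is not modified
            tagged["domain"] = domain    # hypothetical field name for the origin
            merged.append(tagged)
    return merged


def query(merged: List[Record], field: str, value: str) -> List[Record]:
    """Return every record, from any domain, whose field equals value."""
    return [r for r in merged if r.get(field) == value]


# Hypothetical data: two domains, each owned by its own master server.
domains = {
    "sanjose": [{"printer": "lp-a", "type": "postscript"}],
    "sydney": [{"printer": "lp-b", "type": "postscript"}],
}
merged = merge_domains(domains)
hits = query(merged, "type", "postscript")
```

A single query sees the records of both domains, and each hit still says which domain owns it.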

The records are held as a “field=value” list of variable lengths. Only the values defined are stored in the record, and new values can be added at any time.
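To make the record layout concrete, here is a small Python sketch of a "field=value" record. The one-pair-per-line text form and the field names are assumptions made for illustration; the article does not specify how SDDB lays records out on disk or on the wire.

```python
def parse_record(text: str) -> dict:
    """Parse one record given as 'field=value' lines (assumed layout)."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        field, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a field=value line: {line!r}")
        record[field.strip()] = value.strip()
    return record


def format_record(record: dict) -> str:
    """Write a record back out; only the fields that are set appear."""
    return "\n".join(f"{field}={value}" for field, value in record.items())


# Hypothetical record with two fields defined.
record = parse_record("name=lp-a\nmodel=laserjet")

# A new field can be added later without any schema change.
record["location"] = "building-3"
text = format_record(record)
```

Because a record stores only the fields that are defined, two records of the same kind need not carry the same set of fields.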

Indices are held in memory using a “red-black” tree algorithm. Index keys are created and compared by user-supplied functions, so the indices are very flexible. SDDB allows for multiple indices and can detect and reject duplicate entries, unlike NIS, which allows only one index on each file or table.
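The sketch below illustrates multiple indices with user-supplied key functions and duplicate rejection. It is not the SDDB implementation: a sorted list searched with Python's bisect module stands in for the red-black tree, Python's built-in ordering stands in for a user-supplied comparison function, and the class, index and field names are hypothetical.

```python
import bisect
from typing import Any, Callable, List, Optional


class Index:
    """Ordered index over records; a sorted list stands in for the tree."""

    def __init__(self, make_key: Callable[[dict], Any]):
        self.make_key = make_key          # user-supplied key function
        self._keys: List[Any] = []
        self._records: List[dict] = []

    def find(self, key: Any) -> Optional[dict]:
        pos = bisect.bisect_left(self._keys, key)
        if pos < len(self._keys) and self._keys[pos] == key:
            return self._records[pos]
        return None

    def insert(self, record: dict) -> None:
        key = self.make_key(record)
        pos = bisect.bisect_left(self._keys, key)
        if pos < len(self._keys) and self._keys[pos] == key:
            raise KeyError(f"duplicate key: {key!r}")
        self._keys.insert(pos, key)
        self._records.insert(pos, record)


class Table:
    """A set of records reachable through several named indices."""

    def __init__(self, **key_functions: Callable[[dict], Any]):
        self.indices = {name: Index(fn) for name, fn in key_functions.items()}

    def insert(self, record: dict) -> None:
        # Check every index first so a rejected record leaves no partial state.
        for index in self.indices.values():
            if index.find(index.make_key(record)) is not None:
                raise KeyError("duplicate entry")
        for index in self.indices.values():
            index.insert(record)


# Two indices on the same records; the name index ignores case.
table = Table(by_name=lambda r: r["name"].lower(), by_addr=lambda r: r["addr"])
table.insert({"name": "LP-A", "addr": "10.0.0.5"})

try:
    table.insert({"name": "lp-a", "addr": "10.0.0.6"})  # same name, new address
    rejected = False
except KeyError:
    rejected = True
```

Since the key function decides what counts as "the same", the case-insensitive name index rejects the second record even though its address is new.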

The SDDB servers are completely stateless (i.e., they do not store any information between client requests) and use a fast UDP protocol to perform all transactions. A modification sequence number (which is analogous to a modification time) is held on each record so that a master server can decide which records have been updated and need to be propagated to the other servers. Since only the modified data is transferred, the propagation delay can be made very short; it is currently about 30 seconds.
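The role of the modification sequence number can be sketched as follows. This is a simplified model under stated assumptions, not the SDDB protocol: there is no UDP transport, deletions are ignored, and the class and method names are hypothetical. It also assumes the copy reports the highest sequence number it has seen, which the article does not specify.

```python
class Master:
    """Holds the records of one domain and stamps every change."""

    def __init__(self):
        self.sequence = 0     # grows by one on every modification
        self.records = {}     # key -> (sequence number, record)

    def update(self, key, record):
        self.sequence += 1
        self.records[key] = (self.sequence, dict(record))

    def changes_since(self, last_seen):
        """Return only the records modified after last_seen, oldest first."""
        changed = [
            (seq, key, rec)
            for key, (seq, rec) in self.records.items()
            if seq > last_seen
        ]
        return sorted(changed, key=lambda change: change[0])


class Copy:
    """A copy server that remembers the highest sequence number it holds."""

    def __init__(self):
        self.last_seen = 0
        self.records = {}

    def sync_from(self, master):
        changes = master.changes_since(self.last_seen)
        for seq, key, record in changes:
            self.records[key] = record
            self.last_seen = seq
        return len(changes)   # number of records actually transferred


master, copy = Master(), Copy()
master.update("lp-a", {"queue": "lp-a", "host": "server-1"})
master.update("lp-b", {"queue": "lp-b", "host": "server-1"})
first = copy.sync_from(master)     # both records are new to the copy

master.update("lp-a", {"queue": "lp-a", "host": "server-2"})
second = copy.sync_from(master)    # only the modified record moves
third = copy.sync_from(master)     # nothing has changed since
```

The amount of data moved on each pass depends on how much changed, not on the size of the database, which is what makes a short propagation interval affordable.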

SDDB has an API for both C and shell scripts. Thus, you can use either to query, update or delete records in the database. The database is not tied to the print system; it can be used to store any sort of record-oriented data.

Effect of SDDB

I installed SDDB on every print server and converted all the master configuration data into SDDB records. SDDB could now provide the configuration data for mkprint, which produced the configuration files (/etc/printcap, et al.). I rewrote mkprint in C (it was originally written in shell and awk), which improved its speed enormously. It no longer had to use rcp, since the data was already present on the local server. I rewrote the web (CGI) programs so that they no longer relied on the output of mkprint and instead received their information directly from SDDB.

I wrote a front-end for SDDB, called pradmin, designed for the print system. It uses a simple command-line interface, similar to the Cisco router interface. Now multiple users can update the database simultaneously without fear of clashing.

As more and more programs came to rely on SDDB and the data it contained, SDDB became the glue that tied all the print servers together. A single update would affect many servers, which would all act in unison. Every print server knew about every printer at Cisco and acted accordingly. The “Distributed Machine” had started to take shape.

Linux Goes into the Field

Cisco started a spree of buying up small companies, particularly in the San Francisco Bay Area, so it was time to start installing more print servers. Linux machines were the best choice, since they are cheap. A Linux print server would cost under $2000 US, less than a third of the price of its commercial rival, Sun's SPARC 5.

I ported and rewrote the remainder of the programs that, until then, had worked only on SunOS. Now the Linux machines could perform the full function of a print server.

A print server was installed a few miles down the road in Scotts Valley. Aside from a few teething problems, it worked. We then shipped one to Sydney, Australia. I preconfigured it with an IP address, so the only thing the system administrator in Sydney had to do was hook up the power and the network. It worked flawlessly. The SDDB server came up, copied its data down, I ran an mkprint and off it went.

