Linux Print System at Cisco Systems, Inc.
Due to the growth of Cisco and the sheer amount of day-to-day print administration, I had allowed the engineers access to edit the master configuration file. However, this was creating a problem. Since the file was edited with vi, there was no locking, and I was beginning to have problems with people overwriting each other's changes. Also, the file was getting so big the mkprint program was taking a significant amount of time to run. I needed to put this configuration data into a database.
I wrote what I thought would be a “simple distributed database” (SDDB). I soon discovered the first two words are a contradiction in terms. It is actually more like a “network directory” than a database—it performs a function similar to Sun's NIS (Yellow Pages).
NIS maintains separate domains, each of which has a master server and multiple copy servers. While each NIS server can store the data for multiple domains, the data never merge. A client has to “bind”, or attach, to a particular domain on a particular server and can query data only in that domain.
SDDB also maintains separate domains, each with its own master server and copy servers. Each master server receives record updates for its own domain and propagates these changes to all the other servers across all domains. The data from each domain is merged on each server into a single contiguous database—the original domain being stored on each record. Thus, when a client queries the data, it does so across all domains.
The records are held as a “field=value” list of variable lengths. Only the values defined are stored in the record, and new values can be added at any time.
Indices are held in memory using a “red-black” tree algorithm. All creation and comparing is done by user-supplied functions, so the indices are very flexible. SDDB allows for multiple indices and can detect and reject duplicate entries, unlike NIS which allows only one index on each file or table.
The SDDB servers are completely stateless (i.e., they do not store any information between client requests) and use a fast UDP protocol to perform all transactions. A modification sequence number (which is analogous to a modification time) is held on each record so that a master server can decide what records have been updated and need to be propagated to the other servers. Since only the modified data is transferred, the propagation delay can be made very short—it is currently about 30 seconds.
SDDB has an API for both C and shell scripts. Thus, you can use either script to inquire, update or delete records in the database. The database is not tied to the print system—it can be used to store any sort of record-oriented data.
I installed SDDB on every print server and converted all the master configuration data into SDDB records. SDDB could now provide the configuration data for mkprint, which produced the configuration files (/etc/printcap, et al.). I re-wrote mkprint in C (it was originally written in shell and awk script), which improved its speed enourmously. It no longer had to use rcp, since the data was already present on the local server. I rewrote the web (CGI) programs so that they no longer relied on the output of mkprint, and received their information directly from SDDB.
I wrote a front-end for SDDB, called pradmin, designed for the print system. It uses a simple command-line interface, similar to the Cisco router interface. Now multiple users can update the database simultaneously without fear of clashing.
As more and more programs came to rely on SDDB and the data it contained, SDDB became the glue that tied all the print servers together. A single update would affect many servers, which would all act in unison. Every print server knew about every printer at Cisco and acted accordingly. The “Distributed Machine” had started to take shape.
Cisco started a spree of buying up small companies, particularly in the San Francisco Bay Area, so it was time to start installing more print servers. Linux machines were the best choice, since they are cheap. A Linux print server would cost under $2000 US, less than a third of its commercial rival, Sun's SPARC 5.
I ported and rewrote the remainder of the programs that, until then, had worked only on SunOS. Now the Linux machines could perform the full function of a print server.
A print server was installed a few miles down the road in Scotts Valley. Aside from a few teething problems, it worked. We then shipped one to Sydney, Australia. I preconfigured it with an IP address, so the only thing the system administrator in Sydney had to do was hook up the power and the network. It worked flawlessly. The SDDB server came up, copied its data down, I ran an mkprint and off it went.
Practical Task Scheduling Deployment
July 20, 2016 12:00 pm CDT
One of the best things about the UNIX environment (aside from being stable and efficient) is the vast array of software tools available to help you do your job. Traditionally, a UNIX tool does only one thing, but does that one thing very well. For example, grep is very easy to use and can search vast amounts of data quickly. The find tool can find a particular file or files based on all kinds of criteria. It's pretty easy to string these tools together to build even more powerful tools, such as a tool that finds all of the .log files in the /home directory and searches each one for a particular entry. This erector-set mentality allows UNIX system administrators to seem to always have the right tool for the job.
Cron traditionally has been considered another such a tool for job scheduling, but is it enough? This webinar considers that very question. The first part builds on a previous Geek Guide, Beyond Cron, and briefly describes how to know when it might be time to consider upgrading your job scheduling infrastructure. The second part presents an actual planning and implementation framework.
Join Linux Journal's Mike Diehl and Pat Cameron of Help Systems.
Free to Linux Journal readers.Register Now!
- Murat Yener and Onur Dundar's Expert Android Studio (Wrox)
- SUSE LLC's SUSE Manager
- My +1 Sword of Productivity
- Tech Tip: Really Simple HTTP Server with Python
- Non-Linux FOSS: Caffeine!
- Managing Linux Using Puppet
- Google's SwiftShader Released
- Doing for User Space What We Did for Kernel Space
- SuperTuxKart 0.9.2 Released
- Parsing an RSS News Feed with a Bash Script
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide