Building the Perfect Box: How To Design Your Linux Workstation

This article is a guide to building capable Linux workstations from cheap generic PC hardware.

Most of the good things about Linux flow from the fact that it makes a full-featured Unix accessible on inexpensive hardware. Accordingly, there's a huge amount of documentation and folk knowledge in the Linux community about how to get people who already have cheap hardware to use Linux on it. Up to now there hasn't been much advice available on how to acquire cheap hardware that is well-matched to Linux, for someone who already knows Linux.

At today's prices, it's possible to put together a terrific personal Unix platform for less than $2,000 US. If you're prepared to go mail-order, shop carefully and make a few minor tradeoffs, you can do it for $1,500 or even less. But beware. If you buy as though for a DOS/Windows box, you won't get the best value or performance. Linux works its hardware harder than DOS/Windows does, and configurations that are marginal under DOS/Windows can cause problems under Linux.

In this article, we'll develop a recipe for a cheap but capable Linux workstation. While developing it, we'll discuss the recipe choices in some detail, and see how to avoid common pitfalls that can cause you grief.

We are going to stick to Intel hardware in this article. Alphas are fast and have that wonderful 64-bit architecture, and SPARCs too have earned their fans. However, I think PC hardware is still overall the most cost-effective—cheapest to buy, easiest to get serviced and best-tested with Linux. And, given the relative sizes of the respective markets, PC hardware seems likely to hold its lead for years yet.

For more detail on this subject, organized in a reference rather than narrative format, surf to my PC-Clone UNIX Hardware Buyer's Guide at http://www.ccil.org/~esr/clone-hw-guide/contents.html. I've been maintaining this document and its FAQ ancestor for longer than Linux has existed, and have been running Unix on PC hardware since shortly after it first became possible in the late 1980s.

What To Optimize

Most people think of the processor as the most important choice in specifying any kind of personal-computer system. Our first lesson in building Linux boxes is this: for Linux, the processor type is nearly a red herring. It's far more important to specify a capable system bus and disk I/O subsystem.

One important reason for this is that PC systems are marketed in a way that presents processor speed as the primary figure of merit. As a result, processor technology has been pushed harder than anything else, and off-the-shelf PCs have processors that are quite overpowered relative to the speed of everything else in the system. Your typical PC these days has spare CPU-seconds it will never use, because the screen, disk, modem and other peripherals can't be driven fast enough to tax it.

If you're already running Linux, you may find it enlightening to keep top(1) running for a while as you use your machine. Notice how seldom the CPU idle percentage drops below 90%.
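If you would rather log the number than watch it, the same idle figure can be approximated from the kernel's counters. The following is a minimal Python sketch, not part of the original recipe; it assumes a Linux kernel that exposes an aggregate "cpu" line in /proc/stat, and the function names (parse_cpu_line, idle_percent, sample_idle) are made up for illustration.

```python
import time


def parse_cpu_line(stat_text):
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line.

    stat_text is the full text of /proc/stat. Only the first eight
    counters (user, nice, system, idle, iowait, irq, softirq, steal)
    are summed, because later guest counters are already included in
    user time. Older kernels that report fewer columns still work.
    """
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:9]]
            idle = values[3]  # fourth counter is idle time
            return idle, sum(values)
    raise ValueError("no aggregate 'cpu' line found")


def idle_percent(before, after):
    """Percentage of elapsed CPU ticks spent idle between two samples."""
    idle_delta = after[0] - before[0]
    total_delta = after[1] - before[1]
    if total_delta <= 0:
        raise ValueError("samples must be taken at different times")
    return 100.0 * idle_delta / total_delta


def sample_idle(interval=1.0, path="/proc/stat"):
    """Read the counters twice, interval seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return idle_percent(before, after)
```

On a Linux machine, calling sample_idle() in a loop while you edit, compile or browse should give a rough picture of how much processor you actually use. It will not match top(1) digit for digit, since top accounts for I/O wait and other states separately.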

It's true that after people upgrade their motherboards, they often report a throughput increase. But this is often due more to other changes that go with the processor upgrade, such as improved cache memory or an increase in the system bus's clocking speed, i.e., enabling data to get in and out of the processor faster.

The unbalanced, processor-heavy architecture of PCs is hard to notice under DOS and Windows 3.1, because neither OS hits the disk very much. But any OS that uses virtual memory and keeps lots of on-disk logs and other transaction states is a different matter—it will load the disk more heavily and will suffer more from the imbalance.

Linux is in this category, and I'd guess Windows NT and OS/2 are too. Assuming you're buying for Linux on a fixed budget, it makes sense to trade away some excess processor clocks to get a faster bus and disk subsystem.

The truth is that any true 32-bit processor now on the market is more than fast enough for your disks under a typical Linux-like load, even if it's a lowly 386/25. Your screen, if you're running X, can be a bit more demanding—but even a 486/50 will let you drag Xterm windows around like paper. And that's a lot slower than the cheapest new desktop machine you'll be able to find by the time this article hits paper.

So buy a fast bus. And especially, buy fast disks. How does this translate into a recipe? Like this:

  • Don't bother with the latest Pentium Pro whizbang 300MHz super-scorcher with a cooling fan bigger than it is.

  • Do get a PCI-bus machine.

  • Do get a SCSI controller.

  • Do get the fastest SCSI disks you can afford.

Buying PCI will get you maximum bus throughput, and makes sense from several other angles as well. The doggy old ISA bus is clearly headed for extinction at this point, and you don't hear much about its other competitors (EISA, VESA local-bus video or MCA) anymore. With PCI now being used in Macintoshes and Alphas as well as all high-end Intel boxes, it's clearly here to stay, and a good way to protect your investment in I/O cards from rapid obsolescence.

The case for SCSI is a little less obvious, but is still compelling. For starters, SCSI is still at least 10-15% faster than EIDE running flat out. Furthermore, EIDE is still something of a “jerry-rig”. Like Windows, it's layered over an ancestral design (ST-506) that's antiquated and prone to failure under stress. SCSI, on the other hand, was designed from the beginning to scale up well to high-speed, high-throughput systems. Because it's perceived as a “professional” choice, SCSI peripherals are generally better engineered than EIDE equivalents. You'll pay a few dollars more, but for Linux the cost is well repaid in increased throughput and reliability.

For the fastest disks you can find, pay close attention to seek and latency times. The former is an upper bound on the time required to seek to any track; the latter is the maximum time required for any sector on a track to come under the heads, and is a function of the disk's rotation speed.

Of these, seek time is more important and is the one manufacturers usually quote. When you're running Linux, a seek time that is one millisecond faster can make a substantial difference in system throughput. Back when PC processors were slow enough for the comparison to be possible (and I was running System V Unix), it was easily worth as much as a 30MHz increment in processor speed. Today the corresponding figure would be higher.
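Since latency as defined above depends only on rotation speed, you can work it out yourself when a spec sheet omits it. Here is a small Python sketch of that arithmetic; the function names are hypothetical, and any spindle speeds you feed it (such as 5400 or 7200 rpm) are illustrative values rather than figures from this article.

```python
def max_latency_ms(rpm):
    """Worst-case rotational latency: the time for one full revolution.

    A platter turning at rpm revolutions per minute takes
    60,000 / rpm milliseconds to bring any given sector back
    under the heads.
    """
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60_000.0 / rpm


def average_latency_ms(rpm):
    """On average the wanted sector is half a revolution away."""
    return max_latency_ms(rpm) / 2.0


def worst_case_access_ms(max_seek_ms, rpm):
    """Upper bound on positioning time: longest seek plus a full turn."""
    return max_seek_ms + max_latency_ms(rpm)
```

For example, max_latency_ms(7200) comes to roughly 8.3 milliseconds against roughly 11.1 for max_latency_ms(5400), so a faster spindle shaves a few milliseconds off the worst case before seek time enters the picture.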
