Building the Ultimate Linux Box
Five years ago, I wrote an article for Linux Journal that developed a recipe for an elegant and economical Linux box. The article became one of the most popular in LJ's history, so the editors have invited me back for a second round.
This time around LJ recruited Rick Moen, author of some well-known FAQs on modems and other hardware topics, to assist. Daryll Strauss, the man behind the famous Linux renderfarm used in the movie Titanic, contributed sage advice from his background in graphics and extreme data crunching. Also, instead of going for economy we're going to go for maximum crunching power. We're going to ask how to get the absolute highest performance out of hardware we can live with.
Hardware you can live with means a machine that is stable, easy to troubleshoot and inexpensive to maintain. It should be small and low-maintenance enough to live beside your desk, not some liquid-cooled monstrosity. It should be, in short, a PC—a gold-plated hot rod of a PC but a PC nevertheless. Other important aspects of livability are the levels of emitted acoustic noise and heat; we'll be minimizing both.
We'll stick with PC hardware. Alphas are fast and have that wonderful, sexy 64-bit architecture, but the line seems all too likely to be killed in favor of the Itanium before long. Considered in isolation, I like the PowerPC chip a lot better than any x86 architecture. But PC hardware has all the advantages of being the biggest market; it's the easiest to get serviced and least expensive to upgrade.
The Ultimate Linux Box that we showcase will, of course, fall behind the leading edge within months (or even by press time). But walking through the process of developing the ULB will teach you things about system design and troubleshooting that you can continue to apply long after the hardware in this article has become obsolete.
For typical job loads under Linux, the processor type is nearly a red herring—it's far more important to specify a capable system bus and disk I/O subsystem. If you don't believe this, you may find it enlightening to keep top(1) running for a while as you use your machine. Notice how seldom the CPU idle percentage drops below 90%.
If you're building a ULB, go ahead and buy the fastest available processor. Once you've gotten past that gearhead desire for big numbers, pay careful attention to bus speeds and your disk subsystem because that's where you'll achieve serious performance wins. The gap between processor speed and I/O subsystem throughput has only widened in the last five years, so this is even better advice than it was in 1996.
Everybody's Doing Dual Athlons
How does all this translate into a recipe in 2001? Get a PCI-bus machine, not a hybrid PCI/ISA design; you sacrifice about 10% of peak performance with those. Get the fastest available front-side (processor-to-memory) bus (in August 2001, the maximum is 266MHz). Get a high-speed SCSI controller and the fastest SCSI disks you can get your hands on.
The case for SCSI is a little less obvious but still compelling. For starters, SCSI is still at least 10%-15% faster than IDE/ATAPI running flat out. Because it's perceived as a professional choice, SCSI peripherals are generally better engineered than IDE/ATAPI equivalents, and new high-performing drive technologies tend to become available in SCSI first. You'll pay a few dollars more, but the cost is well repaid in increased throughput and reliability. Rick Moen comments:
They call me a SCSI bigot. So be it—but my hardware keeps being future-proof: I don't have to run bizarre emulation layers to address CDRs, I never run low on IRQs or resort to IRQ-sharing, all my hard drives have hardware-level hot-fix, all my hard disk/CD/tape/etc., devices support a stable standard rather than this month's cheap extension kludge, and I don't have to worry about adverse interactions at the hardware or driver levels from mixing ATA and SCSI.
Neither Daryll nor I will have IDE in any machine we build either.
To pick the fastest disks, pay close attention to average seek and latency time. The former is an average time required to seek to any track; the latter is the maximum time required for any sector on a track to come under the heads.
Of these, average seek time is much more important. When you're running Linux, a one millisecond faster seek time can make a substantial difference in system throughput. The manufacturers themselves avoid running up seek time on larger-capacity drives by stacking platters vertically rather than increasing platter size. Thus, seek time, which is proportional to the platter radius and head-motion speed, tends to be constant across different capacities in the same product line. This is good because it means you don't have to worry about a capacity vs. speed trade-off.
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
| Designing Electronics with Linux | May 22, 2013 |
| Dynamic DNS—an Object Lesson in Problem Solving | May 21, 2013 |
| Using Salt Stack and Vagrant for Drupal Development | May 20, 2013 |
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
- Designing Electronics with Linux
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Dynamic DNS—an Object Lesson in Problem Solving
- Using Salt Stack and Vagrant for Drupal Development
- Why Python?
- New Products
- A Topic for Discussion - Open Source Feature-Richness?
- Validate an E-Mail Address with PHP, the Right Way
- What's the tweeting protocol?
- Tech Tip: Really Simple HTTP Server with Python
- General
2 hours 15 min ago - Kernel Problem
12 hours 18 min ago - BASH script to log IPs on public web server
16 hours 45 min ago - DynDNS
20 hours 21 min ago - Reply to comment | Linux Journal
20 hours 53 min ago - All the articles you talked
23 hours 17 min ago - All the articles you talked
23 hours 20 min ago - All the articles you talked
23 hours 21 min ago - myip
1 day 3 hours ago - Keeping track of IP address
1 day 5 hours ago
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi

It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?




Comments
Re: Building the Ultimate Linux Box
Hi Eric - Nice article ...
Shame about the mother board though... it is apparently JUNK! If you are thinking of parting with your hard earned cash in exchange for a Tyan Thunder K7 MB you would be well advised to read ALL of this!
The next couple of paragraphs will give an insite as to why and the link(s) that follows will reveal the whole sorry mess....
TAKEN FROM:- http://www.hardwareanalysis.com/content/article/1325/
We have a bit of a problem here. Before reading any further, read this thread here http://discussions.hardwarecentral.com/Forum2/HTML/010991.html, and this one here http://www.amdforums.com/showthread.php?s=e85326da9d32b72654e6d48a05f1e2...
In short, matters with the Tyan Thunder K7 are not as rosy as can be. In fact, it seems like they
Re: Building the Ultimate Linux Box
Well, yes the mother board suggested seems to be barely worth it's weight in packaging bubbles...So I still want to build an ULB---> but I'm I'm really not on top of my hardware info...Any replacement motherboards to suggest?
Re: Building the Ultimate Linux Box
I am puzzeled by your choice of components. First the CPU's and MB, AMD processors are better space heaters then CPU's, Xeon's run much cooler and use a 400MHZ FSB. Next the MB, why an integrated SCSI controller? I would use a MB with 64 bit PCI slots and 29160 or 39160 controller that could upgrade to a U320 controller when available. Next the hard drives, I would use Seagate Cheetah X15 drives, 3.6ms access time, U320 standard now, better throughput and above all faster warranty turn around. The MB would use a 860 chipset to avoid compatability issues like video timing. I dislike trouble shooting and resolving problems that should not occur. I realize that I may have angered some AMD bigots but I am a pragmatist and have fewer problems with Intel, so it is the path of least resistance. For those that wish to argue benchmark performance everthing come to a screeching halt when you need to resolve compatability problems, the score is 0 when your machine is down.
Re: Building the Ultimate Linux Box
Thank you for a well written article. Being technically disabled there is a part I don't understand.
If there is a floppy-there is ide yes/no?
If yes is there not already the 10% hit ?
Quess I would have saved a few pennies taking non scsi cd-rw and dvd-rom and a possible hit here and using both scsi channels for the hard drives with the backup chained to one. Like I said technically disabled but favor daily improvement over hit for occasional cd write.