Building the Ultimate Linux Box
Five years ago, I wrote an article for Linux Journal that developed a recipe for an elegant and economical Linux box. The article became one of the most popular in LJ's history, so the editors have invited me back for a second round.
This time around LJ recruited Rick Moen, author of some well-known FAQs on modems and other hardware topics, to assist. Daryll Strauss, the man behind the famous Linux renderfarm used in the movie Titanic, contributed sage advice from his background in graphics and extreme data crunching. Also, instead of going for economy we're going to go for maximum crunching power. We're going to ask how to get the absolute highest performance out of hardware we can live with.
Hardware you can live with means a machine that is stable, easy to troubleshoot and inexpensive to maintain. It should be small and low-maintenance enough to live beside your desk, not some liquid-cooled monstrosity. It should be, in short, a PC—a gold-plated hot rod of a PC but a PC nevertheless. Other important aspects of livability are the levels of emitted acoustic noise and heat; we'll be minimizing both.
We'll stick with PC hardware. Alphas are fast and have that wonderful, sexy 64-bit architecture, but the line seems all too likely to be killed in favor of the Itanium before long. Considered in isolation, I like the PowerPC chip a lot better than any x86 architecture. But PC hardware has all the advantages of being the biggest market; it's the easiest to get serviced and least expensive to upgrade.
The Ultimate Linux Box that we showcase will, of course, fall behind the leading edge within months (or even by press time). But walking through the process of developing the ULB will teach you things about system design and troubleshooting that you can continue to apply long after the hardware in this article has become obsolete.
For typical job loads under Linux, the processor type is nearly a red herring—it's far more important to specify a capable system bus and disk I/O subsystem. If you don't believe this, you may find it enlightening to keep top(1) running for a while as you use your machine. Notice how seldom the CPU idle percentage drops below 90%.
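If you would rather log the idle figure than watch it scroll by in top(1), the same number can be derived from the kernel's tick counters. The following is a minimal sketch, assuming a Linux system that exposes /proc/stat with the usual field order (user, nice, system, idle, ...). The function names are illustrative, not part of any standard tool.

```python
import time


def parse_cpu_times(stat_text):
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line.

    stat_text is the content of /proc/stat. Only the first eight
    counters are summed (user, nice, system, idle, iowait, irq,
    softirq, steal); later guest counters are already included in
    user/nice on modern kernels, so adding them would double count.
    """
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:9]]
            idle = values[3]  # fourth counter is the idle column
            return idle, sum(values)
    raise ValueError("no aggregate 'cpu' line found")


def idle_percent(before, after):
    """Idle share, in percent, between two (idle, total) samples."""
    idle_delta = after[0] - before[0]
    total_delta = after[1] - before[1]
    if total_delta <= 0:
        raise ValueError("samples must be taken at different times")
    return 100.0 * idle_delta / total_delta


def measure_idle(interval=1.0, path="/proc/stat"):
    """Sample the counters twice, interval seconds apart (Linux only)."""
    with open(path) as f:
        first = parse_cpu_times(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_cpu_times(f.read())
    return idle_percent(first, second)
```

Calling measure_idle(5.0) from a loop while you work gives a rough record of how often the processor is actually the bottleneck.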
If you're building a ULB, go ahead and buy the fastest available processor. Once you've gotten past that gearhead desire for big numbers, pay careful attention to bus speeds and your disk subsystem because that's where you'll achieve serious performance wins. The gap between processor speed and I/O subsystem throughput has only widened in the last five years, so this is even better advice than it was in 1996.
How does all this translate into a recipe in 2001? Get a PCI-bus machine, not a hybrid PCI/ISA design; you sacrifice about 10% of peak performance with those. Get the fastest available front-side (processor-to-memory) bus (in August 2001, the maximum is 266MHz). Get a high-speed SCSI controller and the fastest SCSI disks you can get your hands on.
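To get a feel for why bus speed matters, it helps to turn clock rates into theoretical peak transfer rates. The sketch below is back-of-the-envelope arithmetic only. It assumes a 64-bit data path for the 266MHz front-side bus and a conventional 32-bit, 33MHz PCI bus. Neither width is stated in this article, and real-world throughput will be lower than these ceilings.

```python
def peak_bandwidth_mb_per_s(clock_mhz, width_bits):
    """Theoretical peak in MB/s: one transfer of width_bits per clock."""
    return clock_mhz * width_bits / 8


# Assumed widths (not from the article): 64-bit FSB, 32-bit PCI.
fsb = peak_bandwidth_mb_per_s(266, 64)
pci = peak_bandwidth_mb_per_s(33, 32)

print(f"Front-side bus peak: {fsb:.0f} MB/s")
print(f"PCI bus peak:        {pci:.0f} MB/s")
```

Under these assumptions, the path between processor and memory has roughly sixteen times the headroom of the path to the disk controller, which is one reason I/O is so often the limiting factor.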
The case for SCSI is a little less obvious but still compelling. For starters, SCSI is still at least 10%-15% faster than IDE/ATAPI running flat out. Because it's perceived as a professional choice, SCSI peripherals are generally better engineered than IDE/ATAPI equivalents, and new high-performing drive technologies tend to become available in SCSI first. You'll pay a few dollars more, but the cost is well repaid in increased throughput and reliability. Rick Moen comments:
They call me a SCSI bigot. So be it—but my hardware keeps being future-proof: I don't have to run bizarre emulation layers to address CDRs, I never run low on IRQs or resort to IRQ-sharing, all my hard drives have hardware-level hot-fix, all my hard disk/CD/tape/etc., devices support a stable standard rather than this month's cheap extension kludge, and I don't have to worry about adverse interactions at the hardware or driver levels from mixing ATA and SCSI.
Neither Daryll nor I will have IDE in any machine we build either.
To pick the fastest disks, pay close attention to average seek time and latency. The former is the average time required to seek to any track; the latter is the maximum time required for any sector on a track to come under the heads.
Of these, average seek time is much more important. When you're running Linux, a seek time that is one millisecond faster can make a substantial difference in system throughput. The manufacturers themselves avoid running up seek time on larger-capacity drives by stacking platters vertically rather than increasing platter size. Thus, seek time, which depends on the platter radius and head-motion speed, tends to be constant across different capacities in the same product line. This is good because it means you don't have to worry about a capacity vs. speed trade-off.
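The arithmetic behind these two figures is easy to sketch. Worst-case rotational latency is one full revolution of the platter, and the average is half of that (spec sheets usually quote the average). Adding the average seek time gives a rough per-request cost for random I/O. The drive figures below are hypothetical, chosen only to illustrate the calculation, and the model ignores transfer time, caching and command queuing.

```python
def max_rotational_latency_ms(rpm):
    """Worst case: the wanted sector has just passed the head,
    so the platter must make one full revolution."""
    return 60_000.0 / rpm


def avg_rotational_latency_ms(rpm):
    """On average the platter needs half a revolution."""
    return max_rotational_latency_ms(rpm) / 2


def avg_access_time_ms(avg_seek_ms, rpm):
    """Rough cost of one random request: seek plus average latency."""
    return avg_seek_ms + avg_rotational_latency_ms(rpm)


def random_requests_per_second(avg_seek_ms, rpm):
    """Upper bound on random requests per second under this model."""
    return 1000.0 / avg_access_time_ms(avg_seek_ms, rpm)


# Two hypothetical 10,000 RPM drives that differ only in seek time.
for seek_ms in (5.0, 6.0):
    rate = random_requests_per_second(seek_ms, 10_000)
    print(f"{seek_ms} ms seek: about {rate:.0f} random requests/s")
```

In this simplified model, shaving one millisecond off the seek time buys more than ten percent more random requests per second, which is why seek time deserves the closest scrutiny.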