The Ultimate Linux Lunchbox

For those of you with carry-on, high-performance computing clusters, please ensure that they are securely stowed underneath the seat in front of you.

Figure 4. The DQ cluster featured an Ethernet switch and a colorful carrying strap.

We were able to get an awful lot of development work done on DQ at a meeting in Vegas. The switch improved the throughput of the system, and the package was bombproof (although we avoided using that particular phrase in airport security lines). The hardware was basically the same, although one thing we lost was the integrated ThinkPad power supplies—there was no lid on DQ in which to hide them. Nevertheless, this was quite a nice machine.

Sandia was not asleep at the time. Mitch built Minicluster II, which used much more powerful PIII processors. The packaging was very similar to Minicluster I. Once again, we ported LinuxBIOS to this newer node, and the cluster was built to have one master with one disk and three slaves. The slave nodes booted in 12 seconds on this system. In a marathon effort, we got this system going at SC 2002 about the same time the lights started going out. Nevertheless, it worked.

One trend we noticed with the PIII nodes was increased power consumption. The nodes were faster and the technology was newer, yet the power needed was higher still. The improved fabrication technology of the newer chips did not provide a corresponding reduction in power demand; quite the contrary.

It was no longer possible to build DQ with the PIII nodes—they were just too power-hungry. We went down a different path for a while, using the Advantech PCM-5823 boards as shown in Figure 5. There are four CPU boards, and the top board is a 100Mbit switch from Parvus. This switch is handy—it has five ports, so you can connect it directly to your laptop. We needed a full-size PC power supply to run this cluster, but in many ways it was very nice. We preserved instant boot with LinuxBIOS and bproc, as in the earlier systems.

Figure 5. The Geode minicluster needed a full-size power supply to deal with the demands of Pentium III-based nodes.

As of 2004, again working with Mitch Williams of Sandia, we decided to try one more Pentium iteration of the minicluster and set our hungry eyes on the new ADL855PC from Advanced Digital Logic. This time around, things did not work out as well.

First, the LinuxBIOS effort was made more or less impossible by Intel's decision to limit access to the information needed for a LinuxBIOS port to Intel chipsets. We had LinuxBIOS coming up to a point, and printing out messages, but we never could get the memory controller programmed correctly. If you read our earlier articles on LinuxBIOS (see the on-line Resources), you can guess that the romcc code was working fine, because it needs no memory, but the gcc code never worked. Vague hints in the available documents indicated that we needed more information, but we were unable to get it.

Second, the power demand of a Pentium M is astounding. We had expected these to be low-power CPUs, and they can be low power in the right circumstances, but not when they are in heavy use. When we first hooked up the ADL855PC with the supplied connector, which attaches to the hard drive power supply, it would not come up at all. It turned out we had to fabricate a connector and connect it directly to the motherboard power supply lines, not the disk power supply lines, and we had to keep the wires very short. The inrush current for this board is so large that a longer power supply wire makes it impossible for the board to come up. We would not have believed it had we not seen it.

Instead of the 2A or so we were expecting from the Pentium M, the current needed was more on the order of 20A peak. A four-CPU minicluster would require 80A peak at 5 VDC. The power supply for such a system would dwarf the CPUs; the weight would be out of the question. We had passed a strange boundary and moved into a world where the power supply dominated the size and weight of the minicluster. The CPUs are small and light; the power supply is the mass of a bicycle.
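The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration of the numbers quoted above (2A expected, 20A peak, four nodes, a 5 VDC rail); the function and constant names are ours, and the sketch assumes the worst case in which every node peaks at the same moment.

```python
# Figures taken from the text; names are hypothetical, chosen for illustration.
EXPECTED_AMPS_PER_NODE = 2.0   # what we expected a Pentium M board to draw
PEAK_AMPS_PER_NODE = 20.0      # roughly what the board needed at peak
RAIL_VOLTS = 5.0               # supply rail feeding the boards
NODES = 4                      # a four-CPU minicluster


def cluster_peak(amps_per_node, nodes=NODES, volts=RAIL_VOLTS):
    """Return (total amps, total watts), assuming all nodes peak together."""
    total_amps = amps_per_node * nodes
    return total_amps, total_amps * volts


if __name__ == "__main__":
    for label, amps in (("expected", EXPECTED_AMPS_PER_NODE),
                        ("peak", PEAK_AMPS_PER_NODE)):
        total_amps, total_watts = cluster_peak(amps)
        print(f"{label}: {total_amps:.0f} A, {total_watts:.0f} W")
```

At the expected draw, the supply would have needed to deliver 8A, or 40 Watts; at the peak draw, it must handle 80A, or 400 Watts, a factor of ten more.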

The Pentium M was acceptable for a minicluster powered by AC, as long as we had large enough tires. It was not acceptable for our next minicluster. We at LANL had a real desire to build 16 nodes into the lunchbox and run it all on one ThinkPad power supply. PC/104 would allow it, in terms of space. The issues were heat and power.

What is the power available from a ThinkPad power supply? For the supplies we have available from recent ThinkPads, we can get about 4.5A at 16 VDC, or 72 Watts. The switches we use will need 18 Watts, so the nodes are left with about 54 Watts between them. Across 16 nodes, that is a little over 3W each; budgeting 3W per node leaves a little headroom for power supply inefficiencies. If the node is a 5V node, common on PC/104, then we would like 0.5A per node or less.

This power budget pretty much rules out most Pentium-compatible processors. Even the low-power SC520 CPUs need 1.5A at 5V, or 7.5 Watts, which is more than double our budget. We had to look further afield for our boards.
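The budget is easy to check with a short Python sketch. It uses only the figures given in the text (4.5A at 16 VDC, 18 Watts for the switches, 16 nodes, a 3W per-node target); the function names are ours, and the sketch ignores conversion losses, which is why the text rounds the per-node figure down.

```python
# Figures taken from the text; names are hypothetical, chosen for illustration.
SUPPLY_AMPS = 4.5       # ThinkPad power supply output current
SUPPLY_VOLTS = 16.0     # ThinkPad power supply output voltage
SWITCH_WATTS = 18.0     # power set aside for the Ethernet switches
NODES = 16              # nodes we want in the lunchbox
TARGET_WATTS = 3.0      # per-node budget after rounding down for headroom


def per_node_budget(amps=SUPPLY_AMPS, volts=SUPPLY_VOLTS,
                    switch_watts=SWITCH_WATTS, nodes=NODES):
    """Watts available to each node before any conversion losses."""
    return (amps * volts - switch_watts) / nodes


def fits(node_amps, node_volts, budget_watts=TARGET_WATTS):
    """True if a node drawing node_amps at node_volts stays within budget."""
    return node_amps * node_volts <= budget_watts


if __name__ == "__main__":
    print(f"raw per-node budget: {per_node_budget():.3f} W")
    print("SC520 at 1.5 A, 5 V fits:", fits(1.5, 5.0))
    print("0.5 A, 5 V node fits:", fits(0.5, 5.0))
```

The raw division gives a little more than the 3W target, and the difference is the headroom mentioned above. A 5V board drawing 0.5A needs 2.5 Watts and fits; the SC520 at 7.5 Watts does not.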

We settled on the Technologic TS7200 boards for this project. The choice of a non-Pentium architecture had many implications for our software stack, as we shall see.


