The Ultimate Linux Lunchbox

For those of you with carry-on, high-performance computing clusters, please ensure that they are securely stowed underneath the seat in front of you.

In this article, we describe the construction of the Ultimate Linux Lunchbox, a 16-node cluster that runs from a single IBM ThinkPad power supply but can also run from an N-charge or similar battery. The lunchbox has a built-in Ethernet switch and only three external connections: one AC plug, one battery connector and one Ethernet cable. To use the lunchbox with your laptop, you merely need to plug the Ethernet cable into the laptop and supply appropriate power (even the power available in an airplane seat will do), and away you go, running your cluster at 39,000 feet. We designed the lunchbox so that we can develop software on it, whether as a private in-office cluster or as a travel cluster. The lunchbox is an example of a newer class of clusters called miniclusters.

Miniclusters

Miniclusters were first created by Mitch Williams of Sandia/Livermore Laboratory in 2000. Figure 1 shows a picture of his earliest cluster, Minicluster I. This cluster consisted of four Advanced Digital Logic boards, using 277MHz Pentium processors. These boards had connectors for the PC/104+ bus, which is a PC/104 bus with an extra connector for PCI.

As you can see, there are only four nodes in this cluster. The base of the cluster is the power supply, and the cluster requires 120 Volts AC to run. We also show a single CPU card on the right. The green pieces at each corner form the stack shown in the pictures. A system very much like this one is now sold as a product by Parvus Corporation.

Figure 1. Minicluster I used four Pentium-based single-board computers (courtesy Sandia National Labs).

Figure 2. One Node of Minicluster I (courtesy Sandia National Labs)

The Bento Series

We were intrigued by this cluster and thought it would be an ideal platform for Clustermatic. In the summer of 2001, we ported LinuxBIOS to this card and got all the rest of the Clustermatic software running on it. When we were done, we had a card that booted to Linux in a few seconds, and that booted into full cluster mode in less than 20 seconds. Power and reset cycles ceased to be a concern.

We provided the LinuxBIOS and other software to Mitch, and he modified Minicluster I to use it. Mitch was able to remove three disks, reducing power and improving reliability. One node served as the cluster master node, and three other nodes served as slave nodes.

Inspired by Mitch's work, we built our first Bento cluster in 2002. In fact, the lunchbox used for that system is the one we use for the Ultimate Linux Lunchbox. This system had seven CPU cards. It needed two power supplies, made by Parvus, which generate the 5V needed for the CPU cards and can take 9–45 VDC input. It had a built-in Ethernet hub, which we created by disassembling a 3Com TP1200 hub and putting the main card into the lid. This cluster used three IBM ThinkPad power supplies. Two of them are visible in the lid, on either side of the Ethernet hub. The third is visible at the back of the case. One ThinkPad supply drives the hub; the other two each drive one of the two Parvus supplies. The Parvus supplies and the fan board for each can be seen at the far right and left of the box; the seven CPU boards are in the middle.

Figure 3. The First Lunchbox Cluster, Bento

Bento was great. We could develop on the road or in long and boring meetings, and test on a seven-node cluster. Because a node took 15 seconds or so at most to reboot, testing out modules was painless. In fact, on this system, compiling and testing new kernel modules was about as easy as compiling and testing new programs. Diskless systems, which reboot really quickly, forever change your ideas about the difficulty and pain of kernel debugging.

During one particularly trying meeting in California, we were able to revamp and rewrite the Supermon monitoring system completely, and use it to measure the impact of some test programs (Sweep3d and Sage) on the temperature of the CPUs as they ran. Interestingly enough, compute-intensive Fortran programs can ramp up the CPU temperature several degrees centigrade in a few seconds. The beauty of these systems is that if anyone suspects you are getting real work done, instead of paying attention to the meeting, you always can hide the lunchbox under your chair and keep hacking.

Bento used a hub, not a switch, and Erik Hendriks wanted to improve the design. The next system was called DQ. DQ was built into an attractive metal CD case, suitable for carrying to any occasion, and especially suitable for long and boring meetings. As our Web page says, we'll let you figure out the meaning of the name. Hint: check out the beautiful pink boa carrying strap in the picture.
