Clustering Is Not Rocket Science

HPC clustering for computing at hypersonic speed is not as difficult as it sounds.

The rocket science involved with designing and developing supersonic combustion ramjets (scramjets) is a tricky business. High-performance Linux clusters are used to aid the study of scramjets by facilitating detailed computations of the gas flow through the scramjet engine. The computational requirements for this and other real-world problems go beyond a few PCs under a desk. Prior to the Linux-cluster age, researchers often had to scale down the problem or simplify the mechanisms being studied to the detriment of the solution accuracy. Now, for instance, entire scramjet engines can be studied at quite high resolution.

In this article, we try to serve two purposes: we describe our experiences as a research group operating a large-scale cluster, and we demonstrate how Linux and companion software has made that possible without requiring specialist HPC expertise. As HPC Linux clustering has matured, it has become an aid to rocket science, without needing to be rocket science itself.

That last statement probably requires some clarification. When clusters were first built, they were heralded for offering unbeatable performance per dollar or “bang for buck”, as the phrase goes. However, as you tried to scale up to large numbers of nodes, the operation of a large-scale cluster started to become quite complex. For a number of reasons, including lack of clustering software tools, large clusters required a full-time system administrator. We argue that this situation has changed now thanks to simple effective tools written for Linux that are aimed at cluster operation and management.

In June 2004, two research groups at the University of Queensland, the Centre for Hypersonics and the Centre for Computational Molecular Science, teamed up to purchase a cluster of 66 dual-Opteron nodes from Sun Microsystems. The people at Sun were generous enough to sponsor two-thirds of the cost of the machine. A grant from the Queensland Smart State Research Facilities Fund covered the remaining third of the machine cost. Additionally, the University of Queensland provided the infrastructure, such as the air conditioning and specially designed machine room. We suddenly faced the challenge, albeit a pleasant one, of operating a 66-node cluster that was an order of magnitude larger than our previous cluster of five or six desktops. We didn't have the resources to obtain expensive proprietary cluster control kits, nor did we have the experience or expertise in large-scale cluster management. We were, however, highly aware of the advantages Linux offered in terms of cost, scalability, flexibility and reliability.

We emphasise that the setup we arrived at is a simple but effective Linux cluster that allowed the group to get on with the business of research. In what follows, we discuss the challenges we faced as a research group scaling up to a large-scale cluster and how we leveraged open-source solutions to our advantage. What we have done is only one solution to cluster operation, but one that we feel offers flexibility and is easy for research groups to implement. We should point out that expensive cluster control kits with all the bells and whistles weren't an option for us with our limited budget. Additionally, at the time of initial deployment, the open-source Rocks cluster toolkit wasn't ready for our 64-bit Opteron hardware, so we needed to find a way of using the newest kernel that was 64-bit ready. The attraction of packaged cluster deployment kits is that they hide some of the behind-the-scene details. The disadvantages can be that the cluster builder is locked in to a very specific way of using and managing the cluster, and it can be hard to diagnose problems when things go wrong. In setting up our cluster, we've held to the UNIX maxim of “simple tools working together”, and this has given us a setup that is highly configurable, easily maintained and has a transparency of operation.

Laying the Foundations

When building a cluster of five nodes, our IT administrator had given us five IP addresses on the network. That was easy—our machines had an IP, and we left the details of security and firewalling to our network administrator. Now with 66 nodes plus front-end file servers and another 66 service processors each requiring an IP address, it was clear we'd have to use a private network. Basically, our IT administrator didn't want to know us and mumbled something vague about us trying a network address translation (NAT) firewall. So that's what we did; we grabbed an old PC and installed Firestarter and created a firewall for our cluster in about half an hour. Firestarter provides an intuitive interface to Linux's iptables. We created our NAT firewall and were able to forward a few ports through to the front ends allowing SSH access.

With the network topology sorted, the next challenge was installing the operating system on all 66 of the servers. Previously, we had been happy to spend a morning swapping CDs in and out of drives in order to install the OS on a handful of machines. We quickly realised we would require some kind of automated process to deal with 66 nodes. We found that the SystemImager software suite did exactly what we were looking for. Using the SystemImager suite, we needed to install the OS only on one node. After toying with the configuration of that node, we had our golden client, as they call it in SystemImager parlance, ready to go. The SystemImager tools allowed us to take an image and push out the image when required. We also required a mechanism to do OS installs over the network so that we could avoid CD swapping. One of the SystemImager scripts helped us to set up a Pre-boot Execution Environment (PXE) server. This execution environment is handed out to nodes during bootup and allows the nodes to proceed with a network install. With this minimal environment, the nodes partition their disks and then transfer the files that comprise the OS from the front-end server. For the record, we use Fedora Core 3 on the cluster. The choice was motivated by our own familiarity with that distribution and the fact that it is close enough to Red Hat Enterprise Linux that we are able to run the few commercial scientific applications that are installed on the cluster.



Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

This article is a step back

Anonymous's picture

This article is a step back in time with respect to cluster management. I'm shocked the editors published the article, but it speaks to the change in editorial staff. At least the science is interesting.