Clusters for Nothing and Nodes for Free

When the users are away, your company's legacy desktop systems can become a powerful temporary Linux cluster.

For years, our LTSP deployment has been providing multiple X stations to various engineering computers, and we never needed a central application server. The script shown in Listing 6 builds a floppy image for use with all computers. The user simply specifies the network card model.

With this infrastructure, any cluster user can stroll through the buildings with one of those floppies and reboot idle machines into the cluster until sufficient resources are available to run workloads efficiently. For logic simulation, Alex simply adds machines until there are more fast computers in the cluster than slow tests in the suite, so the regression never takes longer than 16 minutes. With that efficiency boost, he rapidly finished the design. Without running mtop, you'd never notice OpenMosix migrating compute-bound processes into the cluster. Meanwhile, others are using the network for different projects.

Large Off-Peak Cluster

Quantum Magnetics has about 100 employees, so our cluster is limited to around 100 nodes, as a few people have more than one computer. We're setting things up so that machines spend nights in the cluster and days as normal user workstations. They reboot at least twice every day and check a configuration directory to decide whether to boot from the network or from the hard drive.

The BIOS must be configured to try the PXE boot before the hard drive. The DHCP servers distinguish between EtherBoot and PXE boot requests, with the latter receiving the boot filename for PXELINUX. There are two directories of configuration files, one for day and one for evening, and a small cron job to switch between them. The daytime boot chains to the master boot record on the hard drive, and the evening boot chains to the PXE version of EtherBoot.

The LTSP configuration file indicates which machines have to reboot on weekday mornings and causes the ctrlaltdel script to run. If a user comes to work early, simply pressing Ctrl-Alt-Del brings the machine back into daytime mode as soon as possible.

Remote Windows administration is used to force workstations to log off after inactivity in the evening and then reboot once. If either of the two network boot stages fail, the machine starts Windows and does not join the cluster.

Long-Term Use

Once your on-demand cluster is running smoothly, resist the temptation to increase it by purchasing a lot of desktop computers you don't otherwise need. The use of LTSP with desktop computers is cost effective only because you already paid for them. There is no financial outlay to acquire them, install them or maintain them when any of their components fail. Dedicated multiprocessor rackmount computers are easily the cheapest way to add processing power to a cluster. By omitting the unnecessary peripherals, they also save money, power, cooling and some failures.

OpenMosix or Mosix offer a quick and easy way to get cluster benefits, but the kernel is making migration decisions in real time. It is inherently less efficient than using explicit workload management with processes dedicated to individual nodes and communicating using MPI. Because you can support both Mosix and MPI within the same cluster, you may want to add job control and MPI libraries to the LTSP client filesystem. Applications that are cluster-aware take advantage of MPI and achieve the ultimate performance available. The other applications always gain partial benefits from Mosix.

On a dual-MPI/Mosix cluster, users have the incentive to migrate to MPI applications. The load balancing algorithms of Mosix always give priority to a local MPI process over a migrated Mosix process, so cluster-unaware applications run more slowly. We haven't started using MPI yet, because none of our critical engineering applications would benefit from it enough to justify the effort needed to establish it.