The Ultimate Linux Lunchbox
Okay, we've built the hardware. Now, what is the software?
In years past, it would have been bproc, as found on the Clustermatic site (see Resources). bproc has a problem, however: it cannot support heterogeneous systems. The very nature of bproc, which requires that process migration work, makes it impossible to use different architectures in a single system. We're going to have to use something else. We want to continue using our ThinkPad laptop as the front end; there are no StrongARM laptops that we know of. It's clear that we are going to need new software for our minicluster.
Fortunately, the timing for this move is good. As of 2.6.13, there is now support for the Plan 9 protocol in the standard Linux kernel. This module, called 9p (formerly v9fs), supports the Plan 9 resource-sharing protocol, 9p2000. At the same time this code was being ported to the Linux kernel, Vic Zandy of Bell Labs was working with us on xcpu, a Plan 9 version of bproc. One of the key design goals of xcpu was to support heterogeneous systems. The combination of 9p in the Linux kernel and xcpu servers ported to Linux has allowed us to build a replacement system for bproc that supports architecture and operating system heterogeneity. Finally, the introduction of new features in 2.6.13 will allow us to remove some of our custom Clustermatic components and improve others. A key new feature is Eric Biederman's kexec system call, which replaces our kmonte system call.
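To give a feel for what the 9p module speaks on the wire, here is a minimal Python sketch that encodes Tversion, the first message of a 9P2000 session. The layout (a little-endian size[4] type[1] tag[2] header, followed by msize[4] and a length-prefixed version string) follows the published 9P2000 protocol. The function names and the 8192-byte msize are illustrative choices made for this sketch and are not part of Clustermatic or xcpu.

```python
import struct

TVERSION = 100   # 9P2000 message type for Tversion
NOTAG = 0xFFFF   # tag value reserved for version negotiation


def encode_string(text: str) -> bytes:
    """Encode a 9P string: 2-byte little-endian length, then UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack("<H", len(data)) + data


def encode_tversion(msize: int = 8192, version: str = "9P2000") -> bytes:
    """Build a Tversion message: size[4] type[1] tag[2] msize[4] version[s]."""
    body = struct.pack("<BHI", TVERSION, NOTAG, msize) + encode_string(version)
    # The size field counts the whole message, including its own 4 bytes.
    return struct.pack("<I", 4 + len(body)) + body


def decode_header(message: bytes):
    """Return (size, type, tag) from the fixed 7-byte 9P header."""
    return struct.unpack_from("<IBH", message, 0)
```

Calling `encode_tversion()` with the defaults yields a 19-byte message whose header decodes to size 19, type 100 and tag 0xFFFF.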
Figure 14 shows a quick outline of the standard bproc boot sequence, as it works on our miniclusters and clusters with thousands of nodes.
The boot sequence, as shown, consists of LinuxBIOS, Linux, Linux network setup, Linux loading another kernel over the network and Linux using the kmonte system call (part of Clustermatic) to boot that second kernel as the working kernel. Why are there two kernels? In Clustermatic systems, we distinguish the OS we use to boot the system from the OS we run during normal operation. This differentiation allows us to move the working kernel forward, while maintaining the boot kernel in Flash.
The new boot sequence is shown in Figure 15. If it looks simpler, well, it is. We no longer have a “boot kernel” and a “working kernel”. The first kernel we boot will, in most cases, be sufficient. Experience shows that we change kernels on our clusters only every 3–6 months or so. There is no need to boot a new kernel each time. Because the 9p protocol and the xcpu service don't change, the kernel versions on the Master node and the worker nodes are not tightly tied together, and we can separate the version requirements of the two. We could not make this kind of separation with bproc.
The result is that we can weld the StrongARM boards and the Pentium front end (Master) into one tightly coupled cluster. In fact, we can easily mix 32- and 64-bit systems with xcpu. We can get the effect of a bproc cluster, with more modern kernel technology. Figure 16 shows how we are changing Clustermatic components for this new technology.
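The practical consequence of heterogeneity is that the front end must hand each node a binary built for that node's architecture, which bproc's process migration could not do. The following Python sketch illustrates only that idea. The node names, architecture labels, file paths and the `plan_launch` function are hypothetical, and this is not how xcpu itself is implemented.

```python
from collections import defaultdict


def plan_launch(nodes, binaries):
    """Group nodes by architecture and pair each group with a matching binary.

    nodes    -- mapping of node name to architecture label (hypothetical)
    binaries -- mapping of architecture label to a binary path (hypothetical)
    """
    groups = defaultdict(list)
    for name, arch in nodes.items():
        if arch not in binaries:
            raise ValueError(f"no binary built for {arch} (node {name})")
        groups[arch].append(name)
    return {arch: (binaries[arch], sorted(names)) for arch, names in groups.items()}


# Illustrative cluster: two StrongARM workers and a Pentium front end.
nodes = {"n0": "arm", "n1": "arm", "master": "i386"}
binaries = {"arm": "bin/arm/hello", "i386": "bin/i386/hello"}
plan = plan_launch(nodes, binaries)
```

A node whose architecture has no matching binary raises an error up front, instead of failing later on the worker.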
In this article, we showed how we built the Ultimate Linux Lunchbox, a 16-node cluster with integral Ethernet switch, in a small toolbox. The cluster is built of hardy PC/104 nodes and can easily survive a drop-kick test and possibly even an airport inspection. The system has only three connectors: one Ethernet, one AC plug and one battery connection.
We also introduced the new Clustermatic software, based around the Plan 9-inspired 9p filesystem, now available in 2.6.13. The new software reduces Clustermatic complexity, and the number of kernel modifications is reduced to zero.
Although there was not room to describe this new software in this article, you can watch for its appearance at clustermatic.org, or come see us at SC 2005 in November, where we will have a mixed G5/PowerPC/StrongARM/Pentium cluster running, demonstrating both the new software and the Ultimate Linux Lunchbox.
This research was funded in part by the Mathematical Information and Computer Sciences (MICS) Program of the DOE Office of Science and the Los Alamos Computer Science Institute (ASCI Institutes). Los Alamos National Laboratory is operated by the University of California for the National Nuclear Security Administration of the United States Department of Energy under contract W-7405-ENG-36. Los Alamos, NM 87545 LANL LA-UR-05-6053.
Resources for this article: /article/8533.
Ron Minnich is the team leader of the Cluster Research Team at Los Alamos National Laboratory. He has worked in cluster computing for longer than he would like to think about.