The Ultimate Linux Lunchbox
Okay, we've built the hardware. Now, what is the software?
In years past, it would have been bproc, as found on the Clustermatic site (see Resources). bproc has a problem, however; it cannot support heterogeneous systems. The very nature of bproc, which requires that process migration works, makes the use of different architectures, in a single system, impossible. We're going to have to use something else. We want to continue using our ThinkPad laptop as the front end; there are no StrongARM laptops that we know of. It's clear that we are going to need new software for our minicluster.
Fortunately, the timing for this move is good. As of 2.6.13, there is now support for the Plan 9 protocol in the standard Linux kernel. This module, called 9p (formerly v9fs), supports the Plan 9 resource-sharing protocol, 9p2000. At the same time this code was being ported to the Linux kernel, Vic Zandy of Bell Labs was working with us on xcpu, a Plan 9 version of bproc. One of the key design goals of xcpu was to support heterogeneous systems. The combination, of 9p in the Linux kernel and xcpu servers ported to Linux, has allowed us to build a replacement system for bproc that supports architecture and operating system heterogeneity. Finally, the introduction of new features in 2.6.13 will allow us to remove some of our custom Clustermatic components and improve others. A key new feature is Eric Biederman's kexec system call, which replaces our kmonte system call.
Figure 14 shows a quick outline of the standard bproc boot sequence, as it works on our miniclusters and clusters with thousands of nodes.
The boot sequence, as shown, consists of LinuxBIOS, Linux, Linux network setup, Linux loading another kernel over the network and Linux using the kmonte system call (part of Clustermatic) to boot that second kernel as the working kernel. Why are there two kernels? In Clustermatic systems, we distinguish the OS we use to boot the system from the OS we run during normal operation. This differentiation allows us to move the working kernel forward, while maintaining the boot kernel in Flash.
The new boot sequence is shown in Figure 15. If it looks simpler, well, it is. We no longer have a “boot kernel” and a “working kernel”. The first kernel we boot will, in most cases, be sufficient. Experience shows that we change kernels on our clusters only every 3–6 months or so. There is no need to boot a new kernel each time. Because the 9p protocol and the xcpu service don't change, and the Master node kernel versions are not tightly tied together, we can separate the version requirements of the Master node and the worker node. We could not make this kind of separation with bproc.
The result is that we can weld the StrongARM boards and the Pentium front end (Master) into one tightly coupled cluster. In fact, we can easily mix 32- and 64-bit systems with xcpu. We can get the effect of a bproc cluster, with more modern kernel technology. Figure 16 shows how we are changing Clustermatic components for this new technology.
In this article, we showed how we built the Ultimate Linux Lunchbox, a 16-node cluster with integral Ethernet switch, in a small toolbox. The cluster is built of hardy PC/104 nodes and can easily survive a drop-kick test and possibly even an airport inspection. The system has only three connectors: one Ethernet, one AC plug and one battery connection.
We also introduced the new Clustermatic software, based around the Plan 9-inspired 9p filesystem, now available in 2.6.13. The new software reduces Clustermatic complexity, and the number of kernel modifications are reduced to zero.
Although there was not room to describe this new software in this article, you can watch for its appearance at clustermatic.org; or, alternatively, come see us at SC 2005 in November, where we will have a mixed G5/PowerPC/StrongARM/Pentium cluster running, demonstrating both the new software and the Ultimate Linux Lunchbox.
This research was funded in part by the Mathematical Information and Computer Sciences (MICS) Program of the DOE Office of Science and the Los Alamos Computer Science Institute (ASCI Institutes). Los Alamos National Laboratory is operated by the University of California for the National Nuclear Security Administration of the United States Department of Energy under contract W-7405-ENG-36. Los Alamos, NM 87545 LANL LA-UR-05-6053.
Resources for this article: /article/8533.
Ron Minnich is the team leader of the Cluster Research Team at Los Alamos National Laboratory. He has worked in cluster computing for longer than he would like to think about.
|Dynamic DNS—an Object Lesson in Problem Solving||May 21, 2013|
|Using Salt Stack and Vagrant for Drupal Development||May 20, 2013|
|Making Linux and Android Get Along (It's Not as Hard as It Sounds)||May 16, 2013|
|Drupal Is a Framework: Why Everyone Needs to Understand This||May 15, 2013|
|Home, My Backup Data Center||May 13, 2013|
|Non-Linux FOSS: Seashore||May 10, 2013|
- RSS Feeds
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
- Dynamic DNS—an Object Lesson in Problem Solving
- New Products
- Validate an E-Mail Address with PHP, the Right Way
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Download the Free Red Hat White Paper "Using an Open Source Framework to Catch the Bad Guy"
- Tech Tip: Really Simple HTTP Server with Python
- Roll your own dynamic dns
3 hours 43 min ago
- Please correct the URL for Salt Stack's web site
6 hours 54 min ago
- Android is Linux -- why no better inter-operation
9 hours 10 min ago
- Connecting Android device to desktop Linux via USB
9 hours 38 min ago
- Find new cell phone and tablet pc
10 hours 36 min ago
12 hours 5 min ago
- Automatically updating Guest Additions
13 hours 14 min ago
- I like your topic on android
14 hours 30 sec ago
- This is the easiest tutorial
20 hours 36 min ago
- Ahh, the Koolaid.
1 day 2 hours ago
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi
It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?