Linux Clusters at NIST
The National Institute of Standards and Technology (NIST) is experimenting with clusters based on commodity personal computers and local area network technology. The purpose of the experimental phase of the cluster project is to determine the viability of using commodity clusters for some of the NIST parallel computing workload. In addition to a Cray C90 running sequential codes, many parallel jobs are run on IBM SP2, SGI Origin 2000, SGI Onyx and Convex supercomputers.
Building clusters to run parallel jobs is done for several reasons. Low initial cost is one, although it is not clear how the long-term costs associated with clusters compare with those of traditionally packaged systems supported by manufacturers. Another reason is availability of systems. Having a slower system available without needing to wait is sometimes better than waiting a long time for access to a faster system. A third reason is that free implementations of the parallel virtual machine (PVM) and message passing interface (MPI) environments are available for Linux. Many NIST parallel applications rely on PVM or MPI.
Figure 1 shows a diagram of the current cluster. Currently, there are 48 machines (nodes) in the cluster; 32 nodes have a 200 MHz Pentium-Pro microprocessor, and 16 nodes have a 333 MHz Pentium II microprocessor. The Pentium-Pro machines are built around the Intel VS440FX (Venus) motherboard, which uses a 33 MHz PCI bus and has four SIMM memory slots. The Pentium II machines use a motherboard based on the Intel 440LX chip set, also with a 33 MHz PCI bus, and 3 DIMM slots for memory. All of the nodes are configured with a single 2.1GB hard disk supporting 512MB of swap space and the file systems. Sixteen nodes have 128MB of memory and 32 nodes have 256MB, for an aggregate total of 10GB of memory.
Sixteen of the cluster machines are connected with both Fast-Ethernet and ATM. The remaining 32 machines are connected with Fast-Ethernet only. Therefore, the cluster can be configured with up to 48 Fast-Ethernet nodes or up to 16 ATM nodes, depending on the requirements of the job to be run. One other Pentium-Pro machine is used as an administrative front end to the cluster. This machine has both Fast-Ethernet and ATM interfaces.
The ATM interface cards are Efficient Networks ENI-155p with 512KB of memory onboard. These nodes are connected to a Fore ASX-1000 switch over OC-3 (155Mbps), using multi-mode fiber. For Fast-Ethernet, we use SMC EtherPower 10/100 and Intel EtherExpress 100+ cards connected to one Ethernet switch: an N-Base MegaSwitch 5000 with 60 Fast-Ethernet ports. The cabling to the switch is done with Category 5 twisted-pair cable. The Ethernet switch is connected to the intra-NIST network via a 100Mbps uplink to a router. Both the ATM and Fast-Ethernet interfaces are configured into different subnets, allowing us to monitor the network traffic independently. There is very little background traffic on the cluster subnets, since the nodes are used only for parallel application programs. Keeping non-computational network traffic to a minimum is important when evaluating the cluster network.
The cluster nodes have been augmented with the NIST-developed MultiKron performance measurement instrumentation (see Resources 6 and 7). The MultiKron PCI board is equipped with a MultiKron VLSI chip, a high precision clock and 16MB of memory. These features allow for precise interval measurement and storage of trace data with little perturbation. Another advantage of MultiKron is that the clocks on several boards (up to 16 at present) can be time-synchronized, allowing for tracing of events across the cluster with 25-nanosecond resolution. This type of measurement is important for precise tracing of network events.
The 32 Fast-Ethernet-only cluster nodes are running Linux kernel version 2.0.29. We've found this version to be the most stable for our configuration. However, because support for the Intel EtherExpress 100+ is not in this kernel release, the device driver is built as a module. The 16 ATM/Ethernet nodes use Linux kernel 2.1.79, as this release is required for the ATM software version in use.
Development and support for the ATM software comes from the Linux-ATM project run by Werner Almesberger at the Swiss Federal Institute of Technology (EPFL). (See Resources 2.) We are currently running version 0.34 of this software, having started with version 0.26.
We installed Local Area Multicomputer (LAM) (an implementation of MPI) version 6.1 and Parallel Virtual Machine (PVM) version 3.10 on the cluster in order to run our benchmarks in addition to the NIST parallel jobs.
We also developed a device driver to allow for user-mode programs to control the performance counters present in the Intel Pentium-Pro and Pentium II processors. Two performance counters are present in the Intel Pentium-Pro architecture, along with a timestamp counter. Each counter can be configured to count one of several events, such as cache fetches and instruction executions. (See Resources 3.) The device driver is required because writing to the counter control registers (and the counters themselves) can be done only by the Linux kernel. User-mode programs can directly read counter values without incurring the overhead of a kernel system call to the device driver.
Another tool we use on the cluster is S-Check (see Resources 5), developed by our group. S-Check is a highly automated sensitivity analysis tool for programs. It predicts how refinements in parts of a program will affect performance by making local changes in code efficiencies and correlating these against overall program performance.
We have written many small test kernels to evaluate the performance of communication within the cluster. We have versions of the test kernels that communicate at the raw socket, IP and LAM/PVM library levels. These small kernels are useful in evaluating the overhead of the different communication software levels. By using the MultiKron toolkit, the kernels obtain very precise measurements of network performance.
Fast/Flexible Linux OS Recovery
On Demand Now
In this live one-hour webinar, learn how to enhance your existing backup strategies for complete disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible full-system recovery solution for UNIX and Linux systems.
Join Linux Journal's Shawn Powers and David Huffman, President/CEO, Storix, Inc.
Free to Linux Journal readers.Register Now!
- Back to Backups
- Download "Linux Management with Red Hat Satellite: Measuring Business Impact and ROI"
- Google's Abacus Project: It's All about Trust
- A New Version of Rust Hits the Streets
- Secure Desktops with Qubes: Introduction
- Seeing Red and Getting Sleep
- Fancy Tricks for Changing Numeric Base
- Secure Desktops with Qubes: Installation
- Working with Command Arguments
- Linux Mint 18
Until recently, IBM’s Power Platform was looked upon as being the system that hosted IBM’s flavor of UNIX and proprietary operating system called IBM i. These servers often are found in medium-size businesses running ERP, CRM and financials for on-premise customers. By enabling the Power platform to run the Linux OS, IBM now has positioned Power to be the platform of choice for those already running Linux that are facing scalability issues, especially customers looking at analytics, big data or cloud computing.
￼Running Linux on IBM’s Power hardware offers some obvious benefits, including improved processing speed and memory bandwidth, inherent security, and simpler deployment and management. But if you look beyond the impressive architecture, you’ll also find an open ecosystem that has given rise to a strong, innovative community, as well as an inventory of system and network management applications that really help leverage the benefits offered by running Linux on Power.Get the Guide