Clustering Is Not Rocket Science
As we are interested in massively parallel computations, we needed to configure the servers to communicate with each other. We installed lam-mpi to use as a message-passing interface, and we configured the SSH service on each node to allow passwordless access between nodes by using host-based authentication. Note that lam-mpi doesn't do all the work of parallelizing your application; you still need to write or have available an MPI-aware code.
We configured an NFS (Network File System) server to provide a shared filesystem for all of the cluster compute nodes. We share the home directories of users across all nodes and some of the specialist applications we use for scientific computing. User accounts are managed by the Network Information Service (NIS) that comes standard with most Linux distributions.
Previously, our computational group was about four people sharing time on five nodes. We had an extremely reliable job-scheduling system that involved a whiteboard and some marker pens. Clearly, this method of job scheduling would not scale as we expanded to a user base of about 40 users. We chose the Sun Grid Engine scheduling and batch processing software to install on the cluster.
The other challenge with the expanded user base was that the majority of users had limited experience with an HPC facility and little or no experience using Linux. We decided that one of the best ways to share information about using the cluster was through the use of a wiki page. We set up a wiki page with the MediaWiki package. The wiki page has all manner of information about the cluster—from basic newbie-type information about copying files onto the cluster to advanced usage information about various compilers. The wiki page has been useful in bridging the knowledge gap between the sysadmins and the newbie users. The wiki allows for inexperienced cluster users to modify the documentation to make it simpler for other new players and also add neat tricks they may have devised themselves. The dynamic nature of a wiki page is a clear advantage when it comes to keeping documentation about the cluster facility up to date.
The second purpose of the wiki is to maintain an administrator's log of work on the cluster. As we sit in separate buildings, it was not practical to keep a traditional (physical) logbook. Instead, we use the wiki page to keep each other abreast of changes to the cluster. We actually keep this part of the Web page password-protected to ensure against any wiki vandalism.
Sometimes it is necessary to issue commands on every node of the cluster or copy some files onto all nodes. Again, this wasn't a problem with five or six machines—we'd simply log in to the machines individually and do whatever was necessary. But with 66 machines, logging in to each machine individually becomes both tedious and error-prone. Our solution here was to use the C3 package developed by the group at the Oak Ridge National Laboratory. C3 stands for cluster command and control. It provides a set of Python scripts that allows for remote execution of commands across the cluster. There is also a tool to allow for copying files to groups of compute nodes. This is a Python script that uses rsync to do the transfer.
Speaking of Python scripts, we have found Python to be a useful all-purpose scripting language for cluster work. The particular attraction to Python is its sophisticated support for string manipulation. This allows us to take the text-based output from a number of standalone programs and parse it into more meaningful information. For example, the queuing system provides some detailed information about the status of the cluster, such as available processors on each node and queue availability on each node. Using Python, we can take the detailed output from such a command and provide some summary statistics that give us an indication of cluster load at a glance. Another example of Python scripts in action is our monitoring of temperatures on the compute nodes. This script is displayed in Listing 1. Python's ease of string handling and access to system services come in handy for many scripting tasks on the cluster.
Listing 1. Temperature Monitoring with Python Scripting
---- monitor_temp.py ----
SHUTOFF_TEMP = 50.0
mail_list = ['fake.email@fakedomain.com']
import os, re, smtplib
def send_mail(toaddrs, msg):
fromaddr = 'sysadmin@fakedomain.com'
server = smtplib.SMTP('smtp.uq.edu.au')
server.sendmail(fromaddr, toaddrs, msg)
server.quit()
f = os.popen('hostname', "r")
hostname = f.readline().split()[0]
svc_proc = hostname[:4] + 'sp' + hostname[4:]
f.close()
f = os.popen("ipmitool -I lan -P password -H %s sensor | grep
'cpu[0-1].memtemp'" % svc_proc, "r")
mail_sent = False
for line in f.readlines():
if mail_sent:
break
tokens = line.split()
str = tokens[2]
if str == 'NA':
temp = 1.0
else:
temp = float(str)
if temp >= SHUTOFF_TEMP:
msg = 'Re: hot temperature initiated shutdown for %s\n' %
hostname
msg += 'The CPU memtemp for %s exceeded %.1f.\n' % (hostname,
SHUTOFF_TEMP)
msg += 'This node has been shutdown.\n'
for address in mail_list:
send_mail(address, msg)
# Clean up scratch and power down
os.system('rm -rf /scratch/*')
os.system('/sbin/shutdown now -h')
mail_sent = True
The temperature monitoring script makes use of the intelligent platform management interface (IPMI). By using the IPMI specification, we had an implementation of a monitoring subsystem that permitted fully remote and customizable management of the compute nodes. Each compute node came equipped with a PowerPC service processor that communicated on a separate network from the main cluster. By combining the power of the open-source tools of Python and IPMItool, we created a totally autonomous thermal monitoring system. The system can shut down individual compute nodes if they exceed a predetermined temperature or cut the power if the server doesn't respond to a shutdown command. An e-mail is also sent, using the Python smtplib, to the admin team to advise of the situation.
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
| Designing Electronics with Linux | May 22, 2013 |
| Dynamic DNS—an Object Lesson in Problem Solving | May 21, 2013 |
| Using Salt Stack and Vagrant for Drupal Development | May 20, 2013 |
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
- Designing Electronics with Linux
- New Products
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Linux Systems Administrator
- Dynamic DNS—an Object Lesson in Problem Solving
- Senior Perl Developer
- Technical Support Rep
- UX Designer
- Web & UI Developer (JavaScript & j Query)
- Using Salt Stack and Vagrant for Drupal Development
- Reply to comment | Linux Journal
5 hours 16 min ago - Dynamic DNS
5 hours 50 min ago - Reply to comment | Linux Journal
6 hours 49 min ago - Reply to comment | Linux Journal
7 hours 39 min ago - Not free anymore
11 hours 41 min ago - Great
15 hours 28 min ago - Reply to comment | Linux Journal
15 hours 36 min ago - Understanding the Linux Kernel
17 hours 51 min ago - General
20 hours 20 min ago - Kernel Problem
1 day 6 hours ago






Comments
This article is a step back
This article is a step back in time with respect to cluster management. I'm shocked the editors published the article, but it speaks to the change in editorial staff. At least the science is interesting.