Clustering Is Not Rocket Science
In this article, we have given an oversight of the Opteron cluster setup at the University of Queensland. We have described how effective large-scale cluster computing can be managed by a few sysadmins looking over the cluster a couple of hours per week. The success of the cluster deployment has been in part due to the quality open-source Linux tools available for cluster operation, such as the SystemImager imaging suite and the C3 package for remote command execution. We believe there are significant advantages by using these simple tools rather than cluster deployment kits. Those advantages are a highly configurable and easily upgradable system. Our cluster has been extremely reliable, and the biggest source of downtime is the power interruptions we get due to storms typical of a Queensland summer.
As for the future, we may be approaching the time when we need to consider seriously the use of some type of parallel filesystem. We have been lucky so far with our NFS file server, but we had to educate our users about file staging and ask them to treat the file server with a little bit of respect. But for now, it's all systems go.
Resources for this article: /article/9133.
Rowan Gollan is a PhD student at the Centre for Hypersonics, the University of Queensland, Australia. When not researching radiating flows about planetary-entry vehicles, his duties include part-time supervision of the cluster and a few departmental Linux servers.
Andrew Denman is also a PhD student at the Centre for Hypersonics. Andrew's doctorate is about the computation of turbulent compressible flows. He is also the ultimate authority for all happenings on the cluster.
Marlies Hankel is a Postdoctoral Researcher at the Centre for Computational Molecular Science. Marlies represents the interests of the computational scientists and prevents them from being bullied by the engineers. Marlies' current research focus is on quantum dynamics of reactive scattering processes relevant to combustion and atmospheric chemistry.
- Nmap—Not Just for Evil!
- Resurrecting the Armadillo
- High-Availability Storage with HA-LVM
- March 2015 Issue of Linux Journal: System Administration
- Real-Time Rogue Wireless Access Point Detection with the Raspberry Pi
- DNSMasq, the Pint-Sized Super Dæmon!
- Localhost DNS Cache
- Days Between Dates: the Counting
- The Usability of GNOME
- Linux for Astronomers