The Roadrunner Supercomputer: a Petaflop's No Problem

IBM and Los Alamos National Lab built Roadrunner, the world's fastest supercomputer. It not only reached a petaflop, it beat that by more than 10%. This is the story behind Roadrunner.

Linux Was a Natural Fit

No, I didn't forget to mention that Roadrunner runs on Linux—Red Hat to be exact. From the beginning, the LANL team knew it wanted Linux due to the open nature of its mission and what it sought to accomplish. IBM's Grice added that LANL “has always been interested in Linux things, so it was a natural fit. We did think about [other operating systems] but we didn't think very hard.”

Technically, Linux was a good fit too. The teams didn't need to worry about getting it to run on either the Cell processor or the LS21 blade server, nor was scalability a major issue, because scaling doesn't come down to the node level. Rather, it is about using all of the nodes together, which puts little strain on the operating system. IBM's Linux Technology Center was instrumental in making Linux work on Roadrunner.

Beyond Linux, Grice praised other open-source communities for their “tremendous cooperation”. He explained how they excitedly dived into the unique challenges presented by Roadrunner and its hybrid architecture and surpassed all expectations. The open-source applications involved include the Moab scheduler and Torque resource manager.

To the surprise of IBM and LANL, most potential software “issues” never turned into problems. However, one challenge presented by open source is its numerous development streams, which aren't always compatible with each other. Thus, the teams had to hold back in some places and experiment in others to keep the stack coherent. Nevertheless, the result was satisfying and scaled effectively.

“The notion that there were separate communities who all pulled together, and then it all locked in together as one whole stack, that I think is a fantastic story”, said Grice.

Keeping the Bird Cool

In general, “power and cooling are second only to the software complexity”, emphasized Grice. Power is the real problem for driving HPC forward. Roadrunner addresses these issues through the efficiency of its design. Thanks largely to the efficiency of the Cell processors, Roadrunner needs only 2.3MW of power at full load running Linpack, delivering a world-leading 437 million calculations per second per watt. This result was much better than IBM's official rating of 3.9MW at full load. Such efficiency has placed Roadrunner in third place on the Green 500 list of most efficient supercomputers.

Otherwise, Roadrunner is air-cooled, utilizing large copper heat sinks and variable-speed fans.
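
As a rough sanity check on the power figures quoted above, the short Python sketch below multiplies the measured power draw by the quoted efficiency to see what sustained rate they imply. The constant and function names are illustrative only, and the sketch assumes that "calculations per watt" means calculations per second per watt.

```python
# Figures quoted in the article; the names are illustrative.
POWER_WATTS = 2.3e6          # 2.3MW at full load running Linpack
CALCS_PER_WATT = 437e6       # 437 million calculations per second per watt (assumed unit)
RATED_POWER_WATTS = 3.9e6    # IBM's official full-load rating


def implied_rate(power_watts, calcs_per_watt):
    """Sustained calculations per second implied by power draw and efficiency."""
    return power_watts * calcs_per_watt


rate = implied_rate(POWER_WATTS, CALCS_PER_WATT)
print(f"Implied Linpack rate: {rate / 1e15:.3f} petaflops")
print(f"Measured draw as a share of rating: {POWER_WATTS / RATED_POWER_WATTS:.0%}")
```

Under those assumptions, the two quoted numbers multiply out to roughly one petaflop, and the measured draw comes to about three-fifths of the official rating.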

What Comes after Roadrunner?

Despite Roadrunner's quantum leap into petascale computing, it is merely the beginning of an exciting trend. IBM's Grice spoke of efforts in Europe to re-invigorate supercomputing there, with plans in the pipeline for multi-petaflop machines on-line by 2010. IBM also is planning machines in the tens of petaflops with Los Alamos and Sandia National Laboratories, including a 50-petaflop machine slated for delivery in 2012 or 2013. “We're going to have an exaflop in 11 years”, added Grice, “so we just have to figure out how to power it”. The trend has been amazingly linear, and given the advances in hybrid computing, it likely will continue unabated.
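
To put Grice's exaflop estimate in perspective, the Python sketch below works out the doubling time implied by going from one petaflop to one exaflop in 11 years. It assumes steady exponential growth starting from about one petaflop today, which is one reading of the "linear" trend rather than something Grice stated, and the function name is hypothetical.

```python
import math

PETAFLOP = 1e15
EXAFLOP = 1e18


def doubling_time_years(start_flops, end_flops, years):
    """Years per doubling, assuming steady exponential growth (an assumption)."""
    doublings = math.log2(end_flops / start_flops)
    return years / doublings


t = doubling_time_years(PETAFLOP, EXAFLOP, 11)
print(f"Implied doubling time: {t:.2f} years")
```

A thousandfold increase is just under ten doublings, so the estimate implies that peak performance doubles a little more often than once every 13 months.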

Roadrunner also will raise expectations, and hybrid computing will trickle down, making the once-impossible possible. Climate-change scientists will add more elements to their models, pharmaceutical companies will model the effects of drugs in the body, and Hollywood's special effects will become even more mind-blowing.

As this future unfolds, the Roadrunner teams at IBM and Los Alamos National Lab should be confident in their accomplishment of building the world's fastest supercomputer—the first-ever petaflop machine. It was an incredible achievement in planning, hardware, software and logistics that has set the global standard for supercomputing. It will be interesting to see what the team will accomplish next.

James Gray is Linux Journal Products Editor and a graduate student in environmental sciences and management at Michigan State University. A Linux enthusiast since the mid-1990s, he currently resides in Lansing, Michigan, with his wife and cats.