The Roadrunner Supercomputer: a Petaflop's No Problem

IBM and Los Alamos National Lab built Roadrunner, the world's fastest supercomputer. It not only reached a petaflop, it beat that by more than 10%. This is the story behind Roadrunner.
Linux Was a Natural Fit

No, I didn't forget to mention that Roadrunner runs on Linux—Red Hat to be exact. From the beginning, the LANL team knew it wanted Linux due to the open nature of its mission and what it sought to accomplish. IBM's Grice added that LANL “has always been interested in Linux things, so it was a natural fit. We did think about [other operating systems] but we didn't think very hard.”

Technically, Linux was a good fit too. The teams didn't need to concern themselves about running either the Cell processor or the LS21 blade server, nor is scalability a major issue, as it didn't come down at the node level. Rather, it is about using all of the nodes together, which means a low level of strain on the operating system. IBM's Linux Technology Center was instrumental in making Linux work on the Roadrunner.

Beyond Linux, Grice praised other open-source communities for their “tremendous cooperation”. He explained how they excitedly dived into the unique challenges presented by Roadrunner and its hybrid architecture and surpassed all expectations. Some of the open-source applications include the Moab scheduler and Torque resource manager.

To the surprise of IBM and LANL, most potential software “issues” never turned into problems. However, one challenge presented by open source is the numerous streams that aren't always compatible with each other. Thus, the teams had to hold themselves back in some places and experiment in others to keep a stack that was coherent with itself. Nevertheless, the result was satisfying and scaled effectively.

“The notion that there were separate communities who all pulled together, and then it all locked in together as one whole stack, that I think is a fantastic story”, said Grice.

Keeping the Bird Cool

In general, “power and cooling are second only to the software complexity”, emphasized Grice. Power is the real problem for driving HPC forward. Roadrunner solves these issues through the efficiency of its design. Especially due to the efficiency of the Cell processors, Roadrunner needs only 2.3MW of power at full load running Linpack, delivering a world-leading 437 million calculations per Watt. This result was much better than IBM's official rating of 3.9MW at full load. Such efficiency has placed Roadrunner in third place on the Green 500 list of most efficient supercomputers.

Otherwise, Roadrunner is air-cooled, utilizing large copper heat sinks and variable-speed fans.

What Comes after Roadrunner?

Despite Roadrunner's quantum leap into petascale computing, it is merely the beginning of an exciting trend. IBM's Grice spoke of efforts in Europe to re-invigorate supercomputing there, with plans in the pipeline for multi-petaflop machines on-line by 2010. IBM also is planning in the tens of petaflops with Los Alamos and Sandia National Laboratories, including a 50-petaflop machine slated for delivery in 2012 or 2013. “We're going to have an exaflop in 11 years”, adds Grice, “so we just have to figure out how to power it”. The trend has been amazingly linear, and given the advances in hybrid computing, it likely will continue unabated.

Roadrunner also will raise expectations, and hybrid computing will trickle down, making the once-impossible possible. Climate-change scientists will heap more elements to their models, pharmaceutical companies will model the effects of drugs in the body, and Hollywood's special-effects will become even more mind-blowing.

As this future unfolds, the Roadrunner teams at IBM and Los Alamos National Lab should be confident in their accomplishment of building the world's fastest supercomputer—the first-ever petaflop machine. It was an incredible achievement in planning, hardware, software and logistics that has set the global standard for supercomputing. It will be interesting to see what the team will accomplish next.

James Gray is Linux Journal Products Editor and a graduate student in environmental sciences and management at Michigan State University. A Linux enthusiast since the mid-1990s, he currently resides in Lansing, Michigan, with his wife and cats.

______________________

James Gray is Products Editor for Linux Journal

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

$110 million! And there was

James Artle's picture

$110 million! And there was be debating whether to spend a few thousand on a new laptop! Seriously, all this stuff blows me away; it's hard to imagine how a computer this big and powerful can be built, and why it needs to be this big.

James from Laptop Reviews

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState