The Roadrunner Supercomputer: a Petaflop's No Problem
In 1995, the French threw the world into an uproar. Their testing of a nuclear device on Mururoa Atoll in the South Pacific unleashed protests, diplomatic friction and a boycott of French restaurants worldwide. Thanks to many developments—among them Linux, hardware and software advances and many smart people—physical testing has become obsolete, and French food is back on the menu. These developments are manifested in Roadrunner, currently the world's fastest supercomputer. Created by IBM and the Los Alamos National Laboratory (LANL), Roadrunner models precise nuclear explosions and other aspects of our country's aging nuclear arsenal.
Although modeling nuclear explosions is necessary and interesting to some, the truly juicy characteristic of the aptly named Roadrunner is its speed. In May 2008, Roadrunner accomplished the almost unbelievable—it operated at a petaflop. I'll save you the Wikipedia look-up on this one: a petaflop is one quadrillion (that's one thousand trillion) floating-point operations per second. That's more than double the speed of the long-reigning performance champion, IBM's 478.2-teraflop Blue Gene/L system at Lawrence Livermore National Lab.
Besides the petaflop achievement, the story behind Roadrunner is equally incredible in many ways. Elements such as Roadrunner's hybrid Cell-Opteron architecture, its applications, its Linux and open-source foundation, its efficiency, as well as the logistics of unifying these parts into one speedy unit, make for a great story. This being Linux Journal's High-Performance Computing issue, it seems only fitting to tell the story behind the Roadrunner supercomputer here.
“You want to talk about challenges; it is the logistics of dealing with this many pounds of stuff”, said Don Grice, IBM's lead engineer for Roadrunner. “The folks in logistics have it down to a science.” By the time you read this, the last of Roadrunner's 17 sections with 180 compute nodes—250 tons of “stuff” on 21 semitrucks—will have left IBM's Poughkeepsie, New York, facility, bound for Los Alamos National Laboratory's Nicholas Metropolis Center in New Mexico.
The petaflop accomplishment occurred at “IBM's place”, where the machine was constructed, tested and benchmarked. In reality, Roadrunner achieved 1.026 petaflops—merely 26 trillion additional floating-point calculations per second beyond the petaflop mark. Roadrunner's computing power is equivalent to 100,000 of today's fastest laptops.
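For readers who like to check the arithmetic, the figures quoted so far can be worked through in a few lines of Python. This is only a back-of-the-envelope sketch: the variable names are made up for illustration, and the per-laptop figure is simply what the 100,000-laptop comparison implies, not a measured number.

```python
# Standard SI prefixes, in floating-point operations per second (flops).
TERA = 10**12
PETA = 10**15

# Figures quoted in the article (variable names are illustrative only).
roadrunner_flops = 1026 * TERA       # 1.026 petaflops
blue_gene_l_flops = 4782 * 10**11    # 478.2 teraflops
laptop_count = 100_000               # "100,000 of today's fastest laptops"

# How far past the petaflop mark did Roadrunner go?
margin_flops = roadrunner_flops - PETA

# How much faster is Roadrunner than Blue Gene/L?
speedup = roadrunner_flops / blue_gene_l_flops

# What does the laptop comparison imply for a single laptop?
per_laptop_flops = roadrunner_flops // laptop_count

print(f"Margin over one petaflop: {margin_flops // TERA} teraflops")
print(f"Speedup over Blue Gene/L: {speedup:.2f}x")
print(f"Implied speed per laptop: {per_laptop_flops / 10**9:.2f} gigaflops")
```

The margin works out to 26 teraflops, the speedup to a little over 2.1, and the implied laptop speed to roughly 10 gigaflops, all consistent with the claims above.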
Roadrunner is one of the most complex projects undertaken by both IBM and its partners. IBM produced Roadrunner's two kinds of server blades in two different locations and assembled them into so-called tri-blades in a third. The tri-blades then were shipped to Poughkeepsie to become part of Roadrunner. Despite this logistical hurdle, the project was completed on schedule and within budget.
IBM also had to find partners for the entire interconnect fabric, make it scale and obtain the desired performance. The company also worked with various Linux and other open-source communities to build a coherent software stack. Fears that the high level of coordination among partners, such as Emcore, Flextronics, Mellanox and Voltaire, wouldn't work out were proved unfounded. “They all pulled together in a tremendous, tremendous way”, said Grice. “There isn't any aspect of the machine that isn't doing what it was supposed to.”
Of course, a project of Roadrunner's magnitude requires many smart people at both IBM and LANL, who have collaborated for six years to develop and build it. The team at LANL consisted of 171 people, with a group of similar size on the IBM side. “Los Alamos and IBM have formed a very close partnership”, commented Andy White, LANL's project leader. “We have been able to work together to work through many problems”, he added.
According to White, the project planning began in 2002, when LANL decided to pursue supercomputers with accelerators (in the end, the Cell processors) to meet its modeling needs. The lab had begun hearing about the Cell processor and was intrigued by its potential for LANL's applications. LANL determined that it essentially wanted a very large Linux cluster and realized that with the accelerators, it could reach the petaflop.
IBM and LANL jointly worked on Roadrunner's overall design; IBM implemented the code, namely the computational library (ALF) and the data communication and synchronization library (DaCS) for hybrid systems. The Los Alamos group was tasked with ensuring that its applications would run on the machine. The system modeling group spent an entire year analyzing its applications and the performance characteristics of the machine, making sure that both LANL's classified work and all kinds of interesting science applications would work well. The group built four applications related to its nuclear-physics modeling. These include the Implicit Monte Carlo (IMC) code (the Milagro application suite) for simulation of thermal radiation propagation, the Sweep3D kernel, the SpaSM molecular dynamics code and the VPIC particle-in-cell plasma physics code. LANL's White says that these applications were the basis for asking the question “Can we program [Roadrunner] and can we get accelerated performance on this system?”
There was no shortage of significant intellectual challenges in making Roadrunner do its job. One was to prove that the aforementioned applications could run on an accelerated Roadrunner without having one to run them on! In September 2006, IBM delivered a base system to LANL for testing, but without accelerators. The applications could be tested on the Cell processor but not on the complete node or system. White explained:
The performance and architecture laboratory team actually was able to model the entire system [complete with acceleration] and predict pretty much dead-on what has happened when the code was run on the full system. The fact that we were able to pass two serious technical assessments in October and show people that we can program the machine, the codes can get good speed up, they're accelerated, and we can manage the machine, etcetera, without actually having the machine on hand, I think was a tour de force.
Another challenge involved the networking. While the team worked with the base system in late 2006 and early 2007, the concern arose that Roadrunner's computing horsepower would make the network a bottleneck. Thus, said White, “the nodes were redesigned in-flight”, with the new ones offering a 400% increase in performance from the Opteron to the Cell processors, as well as out to the network, vis-à-vis the original design. All of this was done for the original contract price.
The $110 million Roadrunner was completed on schedule, just in time to qualify for the June 2008 edition of the Top 500 list of the world's most powerful computer systems.
James Gray is Products Editor for Linux Journal.