LinuxBIOS at Four
I started the LinuxBIOS Project at Los Alamos National Lab (LANL) in September 1999. For the prior eight years, I had been building clusters of all kinds and had built my first PC cluster in 1994. In all this time, the BIOS had been a stumbling block in constructing larger clusters.
In 1997, I built the 144-node Cyclone cluster at the Sarnoff Corporation. As an experiment, we had only 16 nodes with video. The experiment was not successful; PCs using the standard BIOS simply are too unreliable to have the video removed, because PC failure recovery always requires interaction with the BIOS. It was clear that if we were to move to ever-larger PC clusters, we needed to resolve the problems of the BIOS.
We decided the ideal PC cluster node would have the following capabilities: boots directly into an OS from some onboard, nonvolatile RAM; configures all the network interfaces but configures no other hardware; connects to a control node using any working network interface; and takes action only at the direction of the control node.
Private industry was not the place to move on this kind of work, however, so we never were able to take these ideas past the talking stage.
Once I got to LANL, I had the ability to pursue these ideas. Several technology trends also made 1999 a far better year than 1997 to look at this problem. In 1999, motherboards with 1MB of Flash were appearing, and the self-describing PCI bus had replaced the older EISA and ISA buses completely. Also important, Linux was becoming much better at doing more configuration, as exemplified by the SGI Visual Workstation, which didn't even have a standard BIOS.
It seemed clear that if we could put Linux in the BIOS part, we could achieve our goals. Linux can do a far better job of running the hardware than any BIOS we have seen. What we needed was a simple hardware bootstrap that loaded Linux from Flash into memory; Linux would do the rest. Hence, our early motto, “Let Linux do it!”
Before we got the LinuxBIOS Project going full steam, we needed to ensure that Linux could be used as an OS bootstrap, which meant that Linux had to be able to boot Linux. By December 1999, we had demonstrated Linux booting Linux with the LOBOS work.
The easiest way to get work done in an Open Source world is to let somebody else do it for you, so the next step in LinuxBIOS was to look for somebody else's software. James Hendricks and Dale Webster found such a system in the OpenBIOS Project. In the space of five days, starting with the OpenBIOS source, they wrote and built a test system on our Intel L440GX+ motherboards that could boot the system from reset—not power on, but reset. Starting from power on would take another five months to figure out, but it wasn't bad work for five vacation days.
We realized early on that assembly code could not be the future of LinuxBIOS. OpenBIOS was a lot of assembly code, with a difficult-to-master build structure. Our small community began a search for a better foundation for LinuxBIOS. Jeff Garzik found a new BIOS and learned that STPC, which had written it, was willing to open source it. The STPC BIOS became the code base for the new LinuxBIOS. The STPC code required substantial reorganization so it could support multiple motherboards and chipsets, but it did provide a good starting point.
The next six months were spent getting a few platforms to run LinuxBIOS. Our first non-graphical platform was an Intel L440GX+ motherboard, followed by an SiS 630 motherboard. With the SiS, we got our first corporate involvement. SiS supplied data books, schematics, assembly code and technical support, all aimed at getting LinuxBIOS running on its platform.
We learned what Linux could and could not do. At the time, we were working with kernel version 2.2. We learned that Linux could not configure a PCI bus from scratch—LinuxBIOS had to do that. We were able to take the PCI code from Linux and, with modifications, use it directly in LinuxBIOS, while adding the extensions we needed for true PCI configuration. We learned that LinuxBIOS came up so fast, the IDE drives were not spun up. We continue to support a patch for Linux to work around this problem. These and a host of other lessons required some unexpected changes in our “Let Linux do it!” philosophy.
By the nine-month mark, we had LinuxBIOS working well on two platforms, written mostly in C, and we had the beginnings of corporate interest. VIA and Acer contributed data books that allowed us to port to their new chipsets. That summer James Hendricks began work on SMP support, and in “Let Linux do it!” mode, that support was written as patches to the Linux kernel, not as extensions to LinuxBIOS. At one point, with our patches, a Linux kernel could come up as a uniprocessor and enable the additional processors from scratch—something that heretofore only the BIOSes knew how to do.
That summer, Linux NetworX joined the effort, and to our good fortune, Eric Biederman got involved. Eric's most important early work was the Alpha port. Eric also cleaned up the memory startup code significantly. Our collaboration continues to this day; Linux NetworX is the largest reseller of LinuxBIOS-based systems, and Eric has spearheaded the creation and architecture of version 2 of LinuxBIOS.
That fall, we presented talks at Atlanta Linux Showcase 2000, and while there met Steve James from Linux Labs. This partnership allowed us, in the space of less than a month, to realize our dream: we built a 13-node LinuxBIOS-based cluster for Supercomputing 2000. The cluster booted to full operational status in about 13 seconds.
By 2001, Linux NetworX had completed the Alpha port for the DS10. We then built a cluster with 104 DS10s, all running LinuxBIOS. The DS10 booted more slowly than the Pentium systems, so it took this cluster 50 or so seconds to come to full operational status, a speed that still was quite acceptable. We were used to BIOSes that took 50 seconds simply to test memory.
The Alpha port demonstrated that LinuxBIOS was portable. Little if any of the code changed, and yet LinuxBIOS worked fine as a 64-bit BIOS or as a 32-bit BIOS.
Since 2001, we have added developers (there are now 11) and continued to port to more platforms, the most recent being the AMD Opteron. We envisioned LinuxBIOS as purely for clusters, but now non-cluster use far outstrips LinuxBIOS use in clusters. We thought Linux could do everything hard; LinuxBIOS does a lot now, including SMP startup. We would have preferred to “Let Linux do it”, but the design of the AMD K7 SMP hardware requires that SMP startup be done in the BIOS.
We thought vendors would jump in. It has taken four years, but in this fifth year of LinuxBIOS development, we now are finding some of the largest computer vendors in the world expressing interest. We simply were a little optimistic on the time frame. Once vendors see the business case, however, they get involved. Vendors sold at least $30 million US worth of LinuxBIOS-based systems in 2003, up from $0 million in 2000.