A Geek In Paradise
I had been to Fermilab only the year before, but when the invitation came from Dan Yocum to meet at Fermilab's facility outside Chicago, how could I refuse? I am a geek at heart.
Fermilab is short for “Fermi National Accelerator Laboratory”, located in Batavia, Illinois. It occupies a parcel of land about three miles on each side (see Figure 1), and houses several accelerator rings which concentrate, in a very small space, energies per particle far greater than those found in the core of the sun. Physicists use this concentrated energy to collide particles at extremely high speed in the search for the basic building blocks of the universe.
In ancient days, various philosophers stated that we would eventually find the “smallest particle”, and for a while this was considered to be the atom. In the relatively recent days of discovering nuclear energy, it was recognized that the atom was not the smallest particle, but was itself made up of other parts such as protons, neutrons and electrons. (Students of physics, please have mercy on me as I try to explain this in words that most readers will understand.) During the last quarter of a century, more and more physicists came to believe there were even smaller particles making up the protons, called quarks and gluons. Quarks (having nothing to do with a resident of Deep Space Nine) are thought to come in six types, and in 1994 the last of these, the “top quark”, was discovered at Fermilab. Unfortunately, the top quark exists for only a very short time (about 10^-24 seconds), so it is very hard to collect data on it, particularly when it is seen only six times in a given year of running the accelerator. Therefore, Fermilab decided to increase the size and power of its accelerator, so it could see anywhere from 20 to 300 times as many top quarks. Unfortunately, this would take anywhere from 20 to 300 times the amount of power and generate 20 to 300 times the amount of raw data to be seen by the detectors, meaning 1,000,000MB of data would be generated every second. Yes, that is one million megabytes of data per second.
Of course, storing that much data would be very difficult, but fortunately Fermilab had determined they would be able to filter the information and store a smaller subset of it (only 18 to 100MB of data per second) for later analysis. To do this, they would have to increase the power of their computing systems significantly, and their former model of using expensive workstations in a workstation farm would not have been affordable. Enter Linux.
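To put those two figures side by side, here is a rough back-of-the-envelope sketch in Python. The rates are the ones quoted above. The assumptions of round-the-clock running and of decimal units (1TB = 1,000,000MB) are mine, purely for illustration, and are not Fermilab's numbers.

```python
# Rates quoted in the article, in MB per second.
RAW_RATE_MB_S = 1_000_000
STORED_RATE_LOW_MB_S = 18
STORED_RATE_HIGH_MB_S = 100

# Assumption for illustration: the accelerator runs 24 hours a day.
SECONDS_PER_DAY = 24 * 60 * 60


def reduction_factor(raw_mb_s: float, stored_mb_s: float) -> float:
    """Ratio of raw data produced to data actually kept."""
    return raw_mb_s / stored_mb_s


def stored_per_day_tb(stored_mb_s: float) -> float:
    """Data kept in one day of continuous running, in TB (1TB = 1,000,000MB)."""
    return stored_mb_s * SECONDS_PER_DAY / 1_000_000


if __name__ == "__main__":
    for rate in (STORED_RATE_LOW_MB_S, STORED_RATE_HIGH_MB_S):
        factor = reduction_factor(RAW_RATE_MB_S, rate)
        per_day = stored_per_day_tb(rate)
        print(f"{rate}MB/s kept: about 1 part in {factor:,.0f}, "
              f"{per_day:.2f}TB per day")
```

In other words, the filter has to throw away all but roughly one part in 10,000 to one part in 56,000 of what the detectors see, and under these assumptions what remains would still add up to somewhere between about 1.6 and 8.6TB per day.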
Last year, when people from Red Hat Software and I visited Fermilab while attending Spring Comdex, I was lucky enough to meet G. P. Yeh, a big fan of Linux and one of the physicists who discovered the top quark. He was kind enough to take us on a short tour of the Fermilab facilities and explain the role of Linux within Fermilab. He explained they investigated Linux and proved that inexpensive PCs running Linux could do the job more than adequately for a price they could afford. They estimated they would need about 2,000 CPUs working together.
This year, when Dan Yocum heard that Linus Torvalds was speaking at Spring Comdex, he enlisted my help in convincing Linus to make a separate trip to Fermilab to speak to the physicists and their families. This did not take much convincing, since Linus has an interest in math, physics and science.
We met at the hotel where Linus was staying, and with a small group of Linux supporters (see Figure 2), drove to Fermilab. It is quite interesting to approach Fermilab, since the land around the accelerator is flat, with only the main building (see Figure 3) rising up from the ground to any height. It would definitely be a great setting for a science fiction movie. We parked the car, went inside and met Dr. G. P. Yeh (whom everyone calls “G.P.”).
G.P. took us on an extended tour, beginning with the top floor of the main building, looking out over the collider rings. “As far as you can see in every direction is Fermilab”, G.P. said. It was an impressive sight. He then took us to see the collider detectors (see Figure 4), noting, “It weighs only 100 tons and cost about 100 million dollars.” Finally, we visited the computer room, where the Linux Farms were going to be placed (see Figures 5 and 6). Fermilab calls their systems “Farms” rather than Beowulf systems. They have master machines that delegate the work to many slave processors, connected by high-speed networking and switches. They are not planning on buying the 2,000 CPUs until very close to the time they need them. After all, prices keep dropping and capabilities keep increasing, so why not wait until the last moment to get the best “bang for the buck”?
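For readers who like to see the idea in code, here is a toy sketch of a master handing out independent batches of work and merging the results. It is not Fermilab's software: threads on a single machine stand in for separate PCs on a network, and the event records, the energy threshold and every function name are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def passes_filter(event: dict) -> bool:
    """Hypothetical selection rule: keep events at or above an energy threshold."""
    return event["energy"] >= 50


def process_batch(batch: list) -> list:
    """Work done by one worker: filter its own batch, independently of the others."""
    return [event for event in batch if passes_filter(event)]


def run_farm(events: list, n_workers: int = 4, batch_size: int = 3) -> list:
    """The 'master': split the events into batches, hand them out, merge the results."""
    batches = [events[i:i + batch_size] for i in range(0, len(events), batch_size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() returns results in the same order as the batches were submitted.
        results = list(pool.map(process_batch, batches))
    return [event for kept in results for event in kept]


if __name__ == "__main__":
    energies = [10, 60, 55, 5, 80, 49, 50]
    events = [{"id": i, "energy": e} for i, e in enumerate(energies)]
    print([event["id"] for event in run_farm(events)])
```

Because each batch can be processed without reference to any other, adding more workers is simply a matter of adding more machines, which is what makes a farm of inexpensive PCs a natural fit for this kind of job.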
After the tour was over, we went to the main auditorium where Linus gave his talk. Those of you who have heard Linus give a speech know he does not like to talk with prepared slides, but instead gives a short prepared talk, then answers questions. This night was no different, other than the topic and complexity of the questions. It was obvious from the questions asked that the audience had more of a computer science bent than other, more general audiences. Questions regarding symmetric multi-processing and the reality of distributing interrupts over multiple CPUs filled the air.
After a significant amount of time answering questions and signing autographs, our little troupe went to the home of Jeff Gerhardt to enjoy pizza and “refreshments”. We were greeted by smoke rolling out of the front door, reminding everyone it is best to take the pizza out of the box before warming it in the oven. When the smoke died down, some interesting home brew made its way to the front, and everyone enjoyed the pizza and brew (see Figures 7 and 8).
I love this type of computing where people push the envelope of what the human mind can conceive, and I thank the government of the United States for helping to fund such a quest.