Introducing the Open Cluster Framework

Talking with Alan Robertson (excitedly) about HA, HPC and their future in open-source clustering.

Those of you familiar with the Linux High Availability (HA) scene will recognize Alan Robertson's name immediately. After all, along with people like Harald Milz and Lars Marowsky-Bree, he's one of the main names in HA Linux, a frequent HA contributor and the owner of the linux-ha.org web site. Although his contributions in the area of HA might be well known to the community, Alan has a project in his background that is less well known but may prove to be instrumental in both the HA and High Performance Computing (HPC) cluster communities--namely, the Open Cluster Framework (OCF) project.

The goal of the HA Project is to provide an HA clustering solution for Linux via community development, and the goal of OCF might be even more ambitious: to define APIs that provide basic clustering functions and to provide a reference implementation of those APIs. Note that these APIs do not extend only to HA clusters, but include HPC clusters as well. What is so ambitious is that at a time when many in the Open Source community are trying to develop solutions around a single project, the OCF is concentrating on trying to unite all the open-source HA projects and unify two major camps, HA and HPC, that are often thought of as separate entities.

The OCF Project itself is in its infancy. First presented at the Ottawa Linux Symposium in July 2001, the group is in the early stages of defining itself, aligning with various groups and supporters and coming up with a preliminary architecture. The philosophy of the OCF is that although HA and HPC have been largely separate over the years, they share many common clustering problems and would realize advantages by sharing code. As such, the intent of the group is to define and develop building blocks that can be used by the different cluster disciplines to build distinctly unique clusters suited to each cluster's needs. Since OCF is defining an API for the building blocks as well as a reference implementation, the group expects there will be different implementations as well.

I could bore you with my personal spin on the OCF, but I recently had the opportunity to meet with Alan Robertson on his visit to beautiful, sunny Poughkeepsie, New York. If you've ever met Alan in person, you know he's nothing if not entertaining and animated. At various times during our interview he flapped his arms, jogged in place and slammed his hand on the table, jarring the tape recorder. He's clearly an evangelist for OCF, and he's on a mission to join the various HA and HPC factions.

Richard: Alan, let's introduce you to our readers. When did Linux first appear on your radar screen?

Alan: Well, let's start with UNIX. I first used UNIX in 1978 when I worked for Bell Labs. I think I first started using Linux in 1993. There was a fellow in the office that was big on Slackware, but I liked the advantages of a packaged system. I think it was 3.03 Red Hat that I first ran, with the 1. something kernel.

Richard: And when did you start contributing to open source?

Alan: There's a little bit of a story before that. Being a person that had done a lot of UNIX kernel hacking in the past, I was delighted to find that the code mistook my Mitsumi CD-ROM for a Sony CD-ROM, so I could find the driver and make it recognize my Mitsumi. So, that was my first and maybe my only Linux kernel hack; I've done mostly user-space Linux stuff. My first contribution was the high availability stuff. My job at the time was technology planner for R&D for Bell Labs. Ken Switzer, my second line manager at the time, asked what Linux had for high availability software. I didn't know, so I went away to find out. I found out that they had a mailing list and a developer's HOWTO by Harald Milz on how to write high availability software. I had read Eric Raymond's The Cathedral and the Bazaar, and I had been fascinated by the idea that it captured something important. It wasn't the hacker mentality or the communal developer approach; it was that by traditional models, Linux should be an abysmal failure. It breaks every single known rule of software development. It doesn't have a plan, it doesn't have an architecture, it doesn't have an architect, it doesn't have a waterfall model, it's all done haphazardly [Alan's voice is dripping with sarcasm at this point] and yet, the evidence is, it's been wildly successful. It captured my sense that a lot of software development methods were nonsense.

Richard: Did you relate this idea of the Cathedral and the Bazaar to HA?

Alan: This is a diversion, hang on. I knew a lot of people developing code; they were all good, and none of them followed the conventional software development models. I felt that Raymond got at something important, and the itch it created in me was to participate in an open-source project to experience it first hand.

I went to visit my in-laws over Christmas, when one tends to have some free time, so I brought my laptop with me. It was the perfect HA cluster. How, you ask, can one node be an HA cluster? Because the other node is dead! [lots of laughter]. So while I was at my in-laws for a couple of weeks I wrote the heartbeat module, came back and announced to the list that I had written some software. But it wasn't as much to learn about HA as it was to learn about the open-source process. I didn't do it to become a project leader, I did it to write some software and scratch my personal itch to learn how open-source development worked firsthand.

Richard: How successful has Heartbeat been since its release? Give me an idea where it's been used.

Alan: It's used by one medical imaging company that has to be on-line to give doctors the images they need. It's used at Los Alamos National Lab, where it's really important that your badge readers actually work, where security is especially high, or there's a security/safety issue. My guess is there's probably several thousand real, true production deployments of this software.

Richard: When did you start to think about the implications of other people in the HA space?

Alan: When other people came into the space. I wrote this code because there was none. I would not have written it had there already been code--I needed a good reason to write it myself.

To go on with the story, I got a call from Germany one day, from a Volker Weigand. Volker was a contributor to my project at the time, and he called to tell me SuSE was expanding their staff and was very interested in high availability. Eventually Volker said that he'd like me to come work for SuSE. I was concerned because I didn't speak German, and I didn't drink beer, and I didn't want to move. Well, over time Volker got me excited about it and invited me for an interview in Germany. Shortly thereafter I joined SuSE. Just as I joined them, they announced they were going to partner with SGI, who wanted to open-source its HA package, Failsafe. So now, I was going to help introduce Failsafe to the Open Source community; they didn't want to ruffle any community feathers, and the feathers they didn't want to ruffle were mine! [laughs] So, they wanted me to help them through the process.

Richard: Your experience with HA must have been invaluable at this point.

Alan: That and my experience with open-source development, but things got weirdly personal. At this point I was the head of two competing open-source projects. Schizophrenia is the word that comes to mind to describe my mental state. Now you see how I have the perspective I have--I have heartbeat and I have Failsafe and I'm head of both projects. So, on one hand SuSE wants me to write a component for a reset service for Failsafe. At the time, I needed a reset service for heartbeat too, and I didn't want to write two reset services. So, this is really very personal. Most of all it occurred to me that no one in open source should care how the machine gets reset. It's not a selling point; as long as the machine gets reset, that's all that matters.

Richard: So at this point, you had detected a baseline service that everybody needs.

Alan: Yes, and I'm in charge of two projects that both need it. I looked at Failsafe, and I thought it would be complicated to embed the code right in Failsafe. I thought it would be better to provide the function in a service that only does reset.

Richard: Now at this point, you don't have an inkling about anything called Open Cluster Framework?

Alan: I have an inkling that this is not the only example of needing a component like this, and I had it in the back of my mind. I didn't have a name for it.

Richard: So, reset was chosen for you as the first component, and you realized this is not the last example of a service needed in a lot of different clusters.

Alan: I also noticed that I liked some things about heartbeat better than Failsafe, and I wanted to use my code in Failsafe. And Failsafe had some stuff that I wanted.

Richard: How did you resolve your schizophrenia?

Alan: Well, I created this reset service and put it in heartbeat because I controlled the write access there--I didn't have Failsafe write access yet. So I developed it and tested it for heartbeat, but I kept it completely outside of Failsafe. I said, here's the nature of reset services. You can do things like ask it, "What kinds of machines can you reset?", and reset this computer by name. Anyway, over time it became the de facto standard for resetting computers; there are 12 implementations and it's used in three open-source projects.

At the start, we had one reset component, and my friend, Lars Marowsky-Bree, was working to get SuSE Linux and Failsafe certified as an official SAP platform, which is a big accomplishment, a lot of work. It means a lot in Germany--if you had to pick one mission-critical application it would be SAP. So, Lars had a big demonstration coming up in three days, but the only reset device he had ran on 110 power only. So, he had a power switch and I didn't have any specs, and he wants to demo it for Failsafe and heartbeat. Coincidentally someone asked me to go talk to the Atlanta Linux User's Group. So I have this plugin architecture, and someone in Atlanta happened to write a plugin that matched the power switch that Lars had in Germany. And Lars needed it in like three days' time, and I found it because I went to the Atlanta LUG! So they put it in, Lars picked it up, and the demo went flawlessly.

My reaction at the end of all this was, this open stuff works! We can accomplish something far beyond what we could have accomplished ourselves. We are working in a way that makes it possible.
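[Sidebar: The reset service Alan describes, one that can be asked what it is able to reset, can reset a computer by name and accepts device-specific plugins, might look roughly like the Python sketch below. Every name in it (ResetPlugin, FakePowerSwitch, ResetService) is hypothetical and chosen for illustration; none of it is taken from the heartbeat, Failsafe or Kimberlite code.]

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ResetPlugin(ABC):
    """Hypothetical plugin interface: one plugin per kind of reset device."""

    @abstractmethod
    def hosts(self) -> List[str]:
        """Answer "which machines can you reset?"."""

    @abstractmethod
    def reset(self, host: str) -> bool:
        """Reset the named machine; return True on success."""


class FakePowerSwitch(ResetPlugin):
    """Stand-in for a networked power switch (no real hardware involved)."""

    def __init__(self, outlets: Dict[str, int]) -> None:
        self._outlets = dict(outlets)   # host name -> outlet number
        self.cycled: List[int] = []     # outlets that were power-cycled

    def hosts(self) -> List[str]:
        return list(self._outlets)

    def reset(self, host: str) -> bool:
        if host not in self._outlets:
            return False
        self.cycled.append(self._outlets[host])
        return True


class ResetService:
    """A service that only does reset; callers never see the device type."""

    def __init__(self) -> None:
        self._plugins: List[ResetPlugin] = []

    def register(self, plugin: ResetPlugin) -> None:
        self._plugins.append(plugin)

    def resettable_hosts(self) -> List[str]:
        names = {h for p in self._plugins for h in p.hosts()}
        return sorted(names)

    def reset(self, host: str) -> bool:
        # Ask each plugin in turn; the first one that knows the host wins.
        for plugin in self._plugins:
            if host in plugin.hosts():
                return plugin.reset(host)
        return False
```

[The point of the design is the one Alan makes: the cluster software calls reset by host name and does not care which device does the work, so a plugin written in Atlanta can serve a demo in Germany.]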

Richard: It goes right back to Eric Raymond.

Alan: It goes right back to Eric's observations...let's do this again! [Alan is jumping around the office.] So not long after this, a project called Kimberlite came on the market. Similar to Failsafe, it came to market under the closed-source rules, and they wanted to become open source. Well, they wanted to use our reset solution as well. And they contributed code that Failsafe uses and heartbeat uses.

Richard: So, you've got three groups all working on HA?

Alan: Yes, and we're all going off in separate directions.

Richard: Is this the environment that spawned OCF? When did it formalize?

Alan: Yes, you can see how this is the environment for a framework. So, I was preaching that we should have common components.

Richard: So how did OCF start to grow?

Alan: Backing up a bit, I sat down one day and thought about why the project wasn't going forward. I mean, I was writing things but no one else was contributing. So I thought, why is that? Maybe no one knows what to do. I never sat down and told anyone what to do. So, I took several hours and wrote a to-do list. There were some critical things that I wanted to do, but I put them down on the list anyway, and posted the list on the mailing list. It was 90 minutes later when someone volunteered.

Richard: So this is when you realized the whole open-source methodology works?

Alan: Not only does it work, but look at the speed! I was shocked to see such a quick reaction--someone from Finland mentioned that heartbeat didn't have authentication. I realized that if Linux-HA was going to grow, it required a simple configuration and good security (to protect the users). Security was probably the top thing on my list that I didn't think I had to do myself. So, three or four weeks later he turned the code in, and it was my first big contribution from someone else.

Richard: You found out how useful a to-do list was in running the project--it's kind of like Tom Sawyer white washing the fence?

Alan: Well, yeah, you have to tell them what fence we're painting today. A lot of people want to help; they don't want to spend time finding out what to do, they just want to do something. Part of my job as a project leader is making sure that the right person is matched to the right project. My wife calls it being a good king.

Richard: How would you characterize the state of the OCF today?

Alan: We're at the point where we're doing internal drafts of standards in the two to three key areas we're working in. We've done lots of work to get participation, two or three proprietary vendors, basically every open-source HA project and several HPC participants as well. One of the things I want to point out is that OCF does not have high availability in its title. That is deliberate, because when you have a cluster of computers, regardless of what they do, they all have some fundamental functions. For example, sometimes you want to kill a node in both HA and HPC. What you don't want is both services fighting over who gets to shoot the node. So you need to coordinate that, and you don't want two ways of doing it.

Richard: You only shoot the node once, and only in one way.

Alan: Yes, what you don't want are two contradictory truths because eventually something bad will happen. If you have an HA membership layer, and an HPC membership layer and each of them thinks the membership consists of a slightly different set of computers, something is not going to work.

Richard: Yes, that's patently a bad thing, isn't it?

Alan: Yes, it's a bad thing. But on the other hand, if you have a membership layer that the two of them can share from a single API, then you've eliminated this type of possible error. It won't make everything work, but it makes it possible for everything to work.
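[Sidebar: As a rough illustration of the shared membership layer Alan has in mind, the Python sketch below keeps one membership set that both an HA and an HPC subsystem consult, and lets a node be shot only once no matter who asks. The class and method names (Membership, subscribe, fence) are assumptions made for this example and are not part of any OCF draft.]

```python
from typing import Callable, FrozenSet, Iterable, List, Set


class Membership:
    """Hypothetical single source of truth for cluster membership."""

    def __init__(self, nodes: Iterable[str]) -> None:
        self._nodes: Set[str] = set(nodes)
        self._listeners: List[Callable[[str], None]] = []

    def members(self) -> FrozenSet[str]:
        # Every subsystem reads the same set, so views cannot diverge.
        return frozenset(self._nodes)

    def subscribe(self, on_node_removed: Callable[[str], None]) -> None:
        # HA and HPC layers register here to hear about removed nodes.
        self._listeners.append(on_node_removed)

    def fence(self, node: str) -> bool:
        """Remove a node exactly once; later requests are no-ops."""
        if node not in self._nodes:
            return False  # already shot by someone else
        self._nodes.remove(node)
        for notify in self._listeners:
            notify(node)
        return True
```

[If the HA layer fences a node first, a later request from the HPC layer simply returns False, and both layers receive the same removal notice. As Alan says, this does not make everything work, but it removes one class of contradiction.]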

Richard: Would you say that the OCF is inclusive, that anyone can join?

Alan: Absolutely, I have gone out of my way, repeatedly, to actively solicit members to the OCF. I would say that the only groups that have actually refused are those that don't have the resources to apply to it right now.

Richard: Is there a hierarchy in the group, and is it formal or informal?

Alan: The group is run informally. The two people that have been pushing it the hardest are myself and Lars Marowsky-Bree. We kind of drew up a map of what we wanted to cover in the standards, and at that point anyone who wanted to participate in each area could. We communicate via e-mail, by voice conferences and by face-to-face meetings. I don't think we do enough with voice and face-to-face meetings.

Richard: Alan, what is the association between OCF and the Free Standards Group (FSG)?

Alan: We are in the process of applying to become an FSG working group. I've followed all the procedures I've been given so far, and I'm waiting to find out what to do next. We are very interested in joining the FSG, and they are excited about the possibilities of working with us.

Richard: Do you feel this affiliation is advantageous to the OCF?

Alan: Yes, first because the OCF is committed to providing standards for Linux. Our open standards are oriented toward Linux; however, we do nothing to preclude them from other versions of UNIX. We try not to do anything to make it unable to run on FreeBSD, for example. So I believe that association with the FSG is good because it is the primary standards body for Linux.

Richard: Tell me, how much buy-in have you gotten from the HPC community?

Alan: With the HPC community we're engaged in a process. There are really two or three major HPC projects that are looking at something analogous to OCF--one is OSCAR and the other is NPACI Rocks. There is mutual interest and an understanding (between HA and HPC) that this might be a profitable way to cooperate. And there are various amounts of effort to make this happen. Everyone is busy doing their own thing, so some of it is learning each other's language so we can communicate. I personally am spending time learning about how the HPC community functions so we can have some kind of mutual terms. So, we're in a socialization process.

Richard: So you'd say the HA group is leading the OCF project, but information is flowing back and forth between HA and HPC, and there seems to be some interest between the leading projects and OCF?

Alan: That's much more concise, yes. I haven't seen a desire to duplicate what we're doing or to dismiss it. So, that's sufficient for now. It's clear that we have different cultures and languages to talk about these things, which adds to the difficulty in joining us all together.

Richard: Being a leader of the OCF, what do you see happening in the next three to six months, and then after that?

Alan: I see us coming out with a real external draft of the standard in the next six or so months. During this time we expect to continue our involvement with the FSG, and then in the next three to six months, to become an official working group. We also expect to see the beginnings of a reference implementation. Beyond the six months, I expect to circulate this draft, collect comments from everyone not yet involved, pick up momentum and get additional attention. During the next six months, I expect some of the HA and HPC work to come to fruition. The HPC world will start to look at the APIs and wonder how to use them.

Richard: Alan, one final question: what did I forget to ask that you'd like to address?

Alan: Two things: clusters are revolutionary, and open-source clusters on commodity hardware are even more so. Our ability to make this revolution happen and reap the benefits for all the people who want to use clusters is dependent on our ability to work together. In that respect, it's very dependent on things like standardization. However, standardization on high-end systems can change things dramatically--the top scientific machines are all clusters. The most cost effective clusters are overwhelmingly open-source clusters built on commodity hardware. They are radically less expensive than their ancestors. If you look at high availability, something similar applies, but people haven't come to realize it so quickly because there hasn't been someone like Donald Becker pushing the idea so hard. So maybe three to five years from now HA will have the same realization as HPC clusters have today, in ways that people never thought to apply them. We have a chance to do 100 times the number of HA clusters than we did before because the cost barriers are down. The potential here is tremendous, and we need to leverage it. Standards are an important part of making this happen.
