The Story of OpenAL
At Loki Entertainment Software, we deal with lots of different kinds of software. From public domain, to BSD-licensed source, to Free Software under GPL and LGPL, to proprietary code under contract and NDAs, to closed source hardware drivers, we encounter it all on a daily basis. Moreover, we also write a variety of source. The games we port to Linux are not open source, but our own work is available as free software in projects like part of projects like Fenris, Setup, SMPEG or SDL.
Crossing the fences from all sides over and over again, and getting to know both the concerns of free software and open-source advocates as well as the concerns of publishers and developers, our priorities have changed. As important as open-source and free software are, one of our projects in particular developed out of the recognition that there is something even more important: open, well-documented standards. That project is OpenAL, and its history so far is a fair bit of education on how important standards are—and how difficult it is to create them.
With the Linux ports of Heretic2 and Heavy Gear 2, Loki encountered engines that used spatialized sound above and beyond left-right panning. Heretic2 for a time served as a showcase for audio hardware manufacturers: Aureal's A3D and Creative's EAX, specifically, were both supported. Both competitors at that time had ventured beyond the DirectSound3D feature set and were following very different approaches to spatialized sound. Windows game developers, caught between three different APIs (not counting the large number of commercial audio SDKs offered for Windows), more often than not chose to ignore the issue altogether and refrained from implementing advanced features (unless they decided to go with one of the SDK offerings that promised to provide a convenient abstraction). The manufacturers stepped up and in many cases guided the developer's hand in adding EAX or A3D support to their engines. Spatialized 3-D sound became a checkbox item, a feature to be listed on the game box, and for a while the Windows game developers were offered an easy ride.
Unfortunately, the situation for Linux users was quite different. Driver support for the SB16 de facto standard had stabilized, and PCI sound cards, all in all, worked reasonably well under Linux. Yet, no ALSA or OSS-proprietary or open-source support for the most advanced features and boards was readily available.
Developers targeting Apple users found themselves in a similar situation—while the multimedia APIs on Mac OS might be more sophisticated and complete than the Linux solutions, they were even less compatible with DirectSound and equally oblivious to EAX and A3D features. Windows developers who cared about portability faced an audio feature set full of shining promises, but without a portable API. Given the vastly different approaches of the two main competitors, things were bound to get worse—at a time when Microsoft didn't even provide DirectSound3D for the NT platform that many developers preferred.
Creative, a dominant force in the audio hardware market, found itself challenged by Aureal, a company that attempted to take its experience with high-end spatialized sound solutions into the PC consumer market. Aureal proposed “wavetracing”, a proprietary approach to generating spatialized sound by simulation of sound propagation and acoustic properties of a scene—essentially, by submitting scene geometry to the sound driver. It is important to understand that to a much larger degree than graphics, audio technology was, and still is, software-based. Both the viable market for SDKs, like Miles Sound System or QSound's QMixer, and the evolution of the competing hardware drivers are clear indications of this. Aureal's advantage was their knowledge of simulation, the calculation of early reflections and a lot of optimization work for their drivers. Their disadvantage was the PC consumer market, an environment in which cheap speakers are added to budget solutions as an afterthought. Moving high-end audio processing out of the laboratory and away from the individual Head-Related Transfer Functions (HRTFs), on a desktop where most users and most developers didn't really spend much thought on spatialization, proved to be extraordinarily difficult and, as we now know, ultimately fatal.
Despite operating at diminishing returns for all their efforts, Aureal managed to accomplish one thing: to awaken interest and tentative user demand for better 3-D audio in games. For early adopters, it became a selling point, and publishers and developers followed eventually. Creative, which had their own road map to future audio features, countered by emphasizing their own extended feature set—a much simpler solution focused on stochastic reverberation to emulate the effect of late reflections. In a way, the early and late reflection solutions are complements but the API's and implementations were fundamentally different, and Creative had picked well: EAX was easier and simpler to use, and for all the lack for simulation accuracy, stochastic reverb proved to be a much more effective audio cue for the noisy, low-end speaker desktop.
Both companies were considering Linux initiatives. On one hand, Aureal had a complete software implementation of “wavetracing”, eventually marketed as “A2D” to support the A3D feature set on their competitors' sound cards that could have been the base of a proprietary or even open-source sample implementation. On the other hand, Aureal's code base was a heavily optimized assembly, and a Linux port was not a foregone conclusion. Creative, on the other hand, recruited Linux coders to develop drivers in-house and looked for ways to support EAX under Linux. Both APIs were written with respect to Microsoft's DirectX and COM, but Aureal's geometry-based A3D API had also been inspired by the OpenGL API. This was one of many examples on the influence of OpenGL on the development of OpenAL.
Enter the game developers.
Even before the days of EAX and A3D, some game developers were looking for a portable sound solution. The OpenGL-GameDev mailing list, which hosts more than its average share of developers interested in portable code, spawned in 1998 a list dedicated to discussion of a new, open audio library—tentatively named OpenAL. Proposals were written and posted, and several developers spent significant time coming up with a good solution. However, it quickly became apparent that between us, “good” was measured by very different metrics. Some of the subscribers were mainly interested in music and composition, solidly grounded in a sound synthesis background. Others cared only about spatialized sound, as it was in the days of DirectSound3D. Some preferred explicit, sophisticated OOP design in the API, while others wanted solutions that were less assuming. For all those efforts, little more than a few headers and a white paper were written, and over the months the project lost focus and tapered out. Lack of consensus and maybe a lack of experience in cooperative efforts contributed to this early failure, but to a much larger degree it was the lack of precise objectives, and the sheer variety of “audio” features to deal with, that doomed the first OpenAL effort not once, but twice. In early 1999, a resurrection occurred, coinciding with a brief mention of OpenAL in Jonathan Blow's roundup of Sound API's and SDK's for Game Developer Magazine (May 1999). Published as a sidebar, Terry Sikes (who had written the 1998 white paper) and I stated the objectives of OpenAL. At that time, Aureal was considering a revised, portable A3D API, and our statement emphasized interoperability with OpenGL as a desirable property for a portable audio library, especially if processing geometry. Neither Aureal nor the OpenAL mailing list made significant progress in their separate efforts, and it wasn't until late 1999 that another company decided to get involved: Loki.
Work on Heretic2 and Heavy Gear 2 was commencing at that time, and while we were not (and, truth to be told, still are not) in a position to support EAX or A3D, it became obvious that some solution was needed. The minimum we needed was a Linux implementation of the feature set of DirectSound3D: distance attenuation and Doppler Effect, spatialization in 3-D, pitch and directional sound cones. Beyond that, it would obviously be desirable to talk to IHVs like Creative and Aureal and nurture their tentative interest in Linux drivers. Loki approached contributors to the original OpenAL discussion, set up a mailing list, committed one developer, the esteemed Joseph I. Valenzuela, full-time to the software sample implementation, and the three of us set out to work. Michael Vance, the lead coder on Heavy Gear 2, and myself (working on Heretic2) revisited the original OpenAL discussion and the existing proposals. A number of decisions, in particular the commitment to OpenGL-like conventions and API design, were taken with an eye on the past failures. Of course, the problem domain “audio” is very different from graphics by any account, but a lot of discussions were avoided by applying GL-like syntax wherever applicable. Initially, we expected a good deal of the API to handle geometry in explicit ways, just as A3D did, but we also applied the same reasoning to other parts of the OpenAL interface. Mimicking GL only got us so far: OpenAL is essentially a scene graph API, with a lot of explicit objects. We deviated from GL on other accounts as well: OpenAL has less entry points and more tokens, which has advantages for changing and deprecating API mechanisms while preserving ABI backward compatibility. We adopted a separation into AL, ALC, ALU and ALUT libraries (in loose analogy to GL, GLX/WGL, GLU and GLUT) but almost exclusively focused on the core AL API. Short-term itches led to immediate scratching, and Heavy Gear 2 was heading toward being the first Loki game to ship along with the first OpenAL sample implementation. Even at that early stage, our OpenAL maintainer found himself in valiant struggle to keep the library in sync with the ever-changing specification drafts.
It was well before the Game Developer's Conference (GDC) in March 2000 in San Jose, that Creative expressed interest in OpenAL. To complement their own Linux driver work, and with respect to portability and acceptance, a vendor-neutral, OS agnostic API had become increasingly attractive for IHV's. Microsoft did not exhibit interest in extending DirectSound3D, and the Interactive Audio Special Interest Group (IASIG), which had published Interactive 3-D Level 1 and Level 2 guidelines, did not consider itself in the business of specifying API's. Loki had tried its best to keep the implementation and the specification suitable for purposes beyond our short-term needs, and we were quite serious about the “Open” part of the project, but the interest expressed by Creative and others was a surprise that changed the rules. Beyond helping us in getting Linux ports of games feature complete, OpenAL now had IHV backing that might eventually establish it as a standard.
Loki and Creative announced their initiative during GDC, and we published the initial specification draft written by Michael Vance. Since then, the scope and requirements have changed quite a bit, and a lot of additional work was done during the last half-year. While our original focus was solely on 3-D spatialized sound, we now have to support stereo and multichannel formats, some of which (like MP3) are quite proprietary. Compression and streaming proved to be difficult issues. More than one solution was implemented, and it was a suggestion from Ian Ollman on the OpenAL mailing list which led to the adoption of buffer queueing to handle streaming audio. The more actual and potential features we had to account for, the more paranoia had to be applied to the API. Backward compatibility was broken not once, not twice, but several times, with extensions added to keep deprecated features available for the games we had already shipped: Soldier of Fortune followed Heavy Gear 2 and was followed by SimCity 3000 Unlimited. By now, Unreal Tournament uses OpenAL as well, and upcoming ports of games based on the UT license will inherit this feature. Cognitoy's Mindrover uses OpenAL under Linux, and they, as well as a few other Windows developers, await the release of Win32 OpenAL drivers with hardware support. At the moment, practically all games that use OpenAL do so under Linux, but other platforms will hopefully follow soon.
Between three of us at Loki, and Jean-Marc Jot, Garin Hiebert, Carlo Vogelsang and others from Creative Labs, the OpenAL specification has progressed to a final draft of the version 1.0—a reasonably complete, solid foundation that covers the DirectSound3D feature set, at the same time deviating from it in many details. For example, buffers are strictly separate from sources and are shared among them. Some of the differences are rather subtle, like the distinction between clamping and reference distances, or the explicit definition of the reference velocity used in Doppler calculations. In some cases, like the handling of multichannel data, we still strive to get away with a minimal extension to the API.
To a degree, the biggest accomplishment at this stage is not so much the Specification Draft itself, but what we have learned in the process. While we are still far from finished, we have established a work flow and a reference for RFC's, proposals, discussions and revisions. We borrowed a little from the IETF and quite a lot from the way OpenGL is extended and modified. The toolchain we currently rely on—the SGML version of the DocBook DTD, and the respective Linux parser and formatting tools—comes with a couple of loose ends, and while the document has become more modular, the layout and rendering leaves a lot to be desired. Extensions have been proposed for various features, but implementations have not yet been provided. Our decision to follow the example SGI set with OpenGL—in several senses—has preserved the initial momentum that vanished from the earlier OpenAL efforts. In a way, choosing the name “OpenAL” has to be understood as a clear commitment to a vendor-neutral, open solution, and I think we have kept the promise.
The next milestone will be the official closure on the final draft of the 1.0 specification, followed by releases of hardware OpenAL drivers for Windows, Linux and other platforms. To fulfill its promise, OpenAL will depend on efficient driver support, a task that has to be addressed by Creative and other IHVs. Loki encourages IHVs to try open-source solutions, but the ultimate decision is theirs. This is a necessary side effect of an open standard—just as every free software project has the right to implement the OpenAL API, companies can choose to implement proprietary versions. Given the relative importance of optimized implementations and improved algorithms for audio SDKs and drivers, it will take time to make a case for shared infrastructure and open source.
Beyond the current specification, we have now a much clearer roadmap for the OpenAL 1.1 revision. The feature set defined by the IASIG I3DL2 guidelines, which are largely based on submissions by Creative, will be added as a vendor-specific extension, aiming for a vendor neutral solution in 1.1. A lot of feature requests and additions have to be reviewed, as issues we set aside for OpenAL 1.0 have to be resolved. Given the small amount of users at this time, we have already gotten some rather unexpected and quite surprising inquiries. The context handling API, ALC, will have to prove itself on several platforms before we can confidently expect that, unlike OpenGL, we will have a portable solution at the context/device level as well. The API, currently minimizing entry points at the expense of tokens, might find a different balance once the semantics are found stable. Meanwhile, we continue to port new games, and while each one has its own different idea about resource and state management, we find the existing API to be quite sufficient. Aside from new, orthogonal features like microphone capture, the majority of the work required is related to streaming and additional codecs. Ironically, the API road map does not aim for handling reflection and occlusion geometry at this point. While it still might make sense to explicitly support large reflectors, the game engines are usually much better equipped to determine obstruction and occlusion, and our objective for OpenAL has shifted to allow the application coder to express the consequences of such conditions. The battlefield has shifted to questions about use of OpenAL for composition instead of straightforward spatialization, generalization of vendor-specific features or abstraction of proprietary solutions. HRTFs, initially considered to be outside the scope of AL and better left to device configuration and driver implementation, have emerged as a possible addition, based on Sensaura's work in this area. Support for Dolby, which was on Aureal's road map, would probably be desirable as well. The implications of the Khronos Initiative for streaming multimedia, spearheaded by SGI, with respect to OpenAL has to be clarified—like OpenGL, extensions for interoperability might be added to OpenAL in the future.
Beyond the design and coding work, OpenAL's organization will have to evolve. What is currently a loose cooperation between volunteers, with a few companies thrown into the mix, will have to be established as a nonprofit organization. Again borrowing from OpenGL's example, there will be an architecture review board, a voting process and all the other structures that come with incorporating open standards. Unlike OpenGL, the sample implementation and the eventually written conformance test suite will be open source. Much like OpenGL, the use of the trademark will be restricted to those who passed the conformance test in some authorized, official way.
The importance of all this is exemplified by Brian Paul's Mesa: like no other free software project I can think of, his clean room implementation of the OpenGL API has, over more than five years, led to the hardware accelerated, 3-D-graphics-capable Linux machines we have today. It was Mesa that provided the framework, and 3dfx's Linux Glide that provided the driver, which brought hardware rendering to Linux (Quake) gaming in the first place. Without SGI's diligent work on their OpenGL specification, and their support for Brian Paul, we might not have the DRI open-source solutions for Linux and probably would not have the proprietary NVidia drivers either. If anything, we want OpenAL to match what OpenGL accomplished for Linux and other non-Windows platforms. For Loki, the most important goal now is to get hardware support on Linux—and Windows: ultimately, we want our fellow game developers to choose open standards over closed ones.
Was it worth it? We did get more on our hands than we bargained for, but if the work we have done so far helps in establishing another portable API, the answer is a definite yes. Linux has only just started to face the issues of standardization and certification, and personally, my respect for the IETF has multiplied several times. The Linux Standard Base (LSB) definitely has their work cut out for them and will need all support it can get. Beyond the LSB, the Linux Desktop will need a Multimedia API supplement and Special Interest Group to give ISVs and IHVs alike some ground to stand on—not to mention all those free software and spare time projects struggling to be portable. It might well be that Linux hackers and users alike underestimate the importance that the DirectX set of services had for Windows at large and games on Windows in particular. It is also possible that many Linux developers still underestimate the importance of the desktop. However, Linux is only one of many platforms that need a comprehensive set of portable multimedia APIs. If OpenAL succeeds in defining another building block for such an OS agnostic “OpenX”, we will have accomplished much more than we initially expected.