Paul Davis: an Ardour for the Challenge
Paul Davis' JACK sound server is the cornerstone of the modern Linux audio system, and his Ardour DAW (digital audio workstation) is regarded as the flagship application in that system. He also has contributed significantly to many other sound and music software projects, including ALSA, the default sound system for the Linux kernel.
After graduating summa cum laude from Portsmouth Polytechnic (UK) with a degree in biomolecular science, he followed a career course as a software engineer, systems programmer and consultant for various technical firms and institutes (a summary is available on-line at www.ak.tu-berlin.de/menue/edgard-varese-gastprofessur/paul_davis), specializing in UNIX C/C++ environments. In 1994, he became the second employee of Amazon.com, where he developed essential software for the fledgling on-line bookstore. Since 1998, he has been operating Linux Audio Systems, a business venture “...focused on using Linux as a platform for audio and MIDI applications, with a particular focus on real-time performance and commercial studio tools”, as the man himself describes it. Currently, he resides in Berlin, where he is the Edgard Varèse Gastprofessor lecturing at the Technische Universität.
In addition to his reputation as a master programmer, Paul is highly regarded in the Linux audio world as a model for team leadership and the practical management of large-scale distributed development projects. He has attracted a very talented crew of developers for his own projects, and his opinions and views are often sought by novice (and not-so-novice) programmers. The following interview reveals some of the reasons why.
DP: Your education and professional history point toward a career as a programmer in business or scientific fields. When and why did you decide to write music and sound software for a living?
PD: After I left Amazon.com, I felt I wanted to leave programming for good. That was fairly easy at first, because I was busy being an at-home parent to my one-year-old daughter. However, after a couple years, I started to explore music making as a hobby, and I immediately realized that for recording, computers and hard disk recording were the way everything was going. For all kinds of reasons, I wasn't going to use Windows or a Mac, but I decided I could leverage my exposure to both Linux and my ten years of UNIX programming by doing things on Linux. It turned out that the tools available at that time (1997) on Linux were really not up to the task, and I foolishly decided that I would just write my own. This came at a time when I didn't need to make an income, and I basically focused on my own needs and desires, which initially mostly were focused on pattern-based composition and real-time synthesis. Eventually, the need for a reasonable recorder became more and more clear, and at the end of 1999, RME released the first Hammerfall card with 26 channels of input, 26 channels of output and an existing Linux device driver (for OSS, not ALSA). It seemed clear that I needed one of these, and that I should update the driver and then write an HDR application. The driver took a few weeks, and a month or two later, I had a working recorder. It was only a few months later that I and other early users/testers recognized that a recorder by itself was useless and that we needed an editor. Had I known then what I know now about what is involved, I think I would have given up.
To get back to the focus of the question, I didn't really try to make an income from my audio software until about three years ago when my financial situation was becoming precarious. Before that, it was almost entirely a labor of love and a lot of fun. Since then, it has become a bit more of a burden—I try to work reasonable amounts, as people actually are paying me now, and it's very hard to make a living in the US by giving away software! I often think that I (and my family) should move to a much cheaper country, where the income I do make from Ardour would go much further than it does here. Right now, alas, this is not an option for us. And, although I don't like to go on about it too much, income really is a major issue for me right now, with one child about to head off to film school and another likely starting a four-year college in a couple years. It's not clear that I can honestly say that I am making a living doing this, but somehow, I am still here when companies like BeOS, Creamware and so many others have come and gone. It amazes me!
Writing software for musicians and audio engineers has revitalized my love of programming. Helping people connect with their creative selves is a real joy, and facing down huge technical hurdles is very satisfying. The programming I did for 15 years before I started doing this stuff was really just a warm up, both in terms of the satisfaction that I've gotten from it and the pleasure I get from how my work affects others.
DP: What are some of the important differences between writing software for a large-scale Web-based business and writing music and sound software? What are the important similarities?
PD: I don't think there are any similarities other than the need to write good code. One of the joys of the way I have worked for the last seven or eight years is that my work is dictated directly by users first and then by my own aesthetics as a programmer. If I decide some code I've written isn't good enough, I am free to rewrite it completely to meet my own standards. Working inside a standard commercial operation, that freedom is hard to find, and your priorities and deadlines are based on marketing realities (and marketing aspirations). One could argue that this delays development and releases, and to a certain extent that's true. But I'm more interested in not facing nasty legacy code problems in five years than I am in meeting someone's idea of when a release should have been finished. The Linux kernel has followed much the same pathway, and it has paid enormous dividends. The kernel code, for the most part, keeps getting cleaner and cleaner even as the kernel gets bigger and more complex. I can't say Ardour is quite as much of a success in this area, but there is a similar motivation.
One day, I would like to know more about the internals of how Ableton Live is developed. I have met the head of Ableton and was very struck by what he told me about how they use a lot of formal software development methods. I have very little experience with these techniques, and I sometimes wonder if Live is a great program because of them or in spite of them.
DP: Project management in the Free Software world has been compared to herding cats. You are well regarded by your colleagues, and it's obvious you know how to control a herd of very talented cats. So, what's your secret? What significant problems have you encountered in managing an international crew working (for free) on a project as complex as Ardour?
PD: The problems didn't really show up in a big way until we started adding MIDI support. This required many deep changes in the code, and since that time, we have two branches of the code under development. We face a real need to keep the MIDI one in sync with the non-MIDI one (where until recently, most development has taken place). There are no tricks to this—just hard work and a “merge early, merge often” policy that sadly I have not adhered to myself. Speaking more broadly, the most difficult problem is one of communication. We use IRC a great deal, and it works very well for us as a loosely connected team. But it has a great personal cost to me, in that I tend to feel a need to be available to talk to people in Finland, Bali, Australia and San Francisco no matter what time of day it is for me.
That said, as you noted, Ardour has some immensely talented people working on the program, and these people generally do not require much supervision. It is true that a much larger part of my days is now spent on communication, but it's really similar to what I read about Linus' role with the kernel. My main role is more to foster consistency and appropriate timing of new features than to be the guy who thinks up all the cool stuff (though I'd like to think that from time to time I do still think up some pretty cool stuff).
DP: Have you pursued formal music studies at any time in your life?
PD: I studied flute in middle school, and in high school I used to mess around on our family piano. I also did a lot of tape machine experimental music—myself and several friends were part of a (culturally) very important “cassette swapping” network in the UK during the late 1970s that sanctioned all kinds of weird experimentation and noise making. My early taste in music was dominated by German electronic music, experimental composers and minimalism, which greatly affected what I wanted to try to sound like as a teenage “noise” maker. I tried to learn tabla about 15 years ago, but gave up realizing I was as hopeless as I feared. About seven years ago, I started learning cello, but when I noticed that despite being (relatively speaking) a person of considerable leisure time, I still did not practice till right before a lesson, I called it quits. It gives me a lot of joy that my daughter (now 13) has taken up cello and is entering a phase now where she is starting to sound quite pleasant.
DP: How important is a formal background in music for a programmer who wants to write music and sound software?
PD: Well, in one sense, almost none. Most of the problems when writing audio software are not about audio or music—they tend to be about computers and programming languages and mistakes that you or other people have made. But on a different level, having a clear understanding of the variety of musical structure around the world, and a real sense of how polyphonic music is composed both historically and today, can be really helpful in developing software that is actually useful to real musicians.
DP: What music do you prefer currently?
PD: My digital music collection has nearly 10,000 songs in it, and I have another 800–1,000 vinyl albums. It would be hard to tag it with a particular style, but on inspection, I have more music by ambient electronic musician Steve Roach than anyone else, and by some strange name-similar coincidence, the minimalist composer Steve Reich comes in second (though a distant second). When I work, I often tend to listen to drifting electronic soundscapes, though nearly as often I'm tuned into SomaFM's Groove Salad or Secret Agent stations. At other times, I could be listening to heavy dub reggae, Kurt Elling tearing up jazz standards, Carnatic Indian classical or Soul Coughing.
DP: Ardour 2 is rolling along nicely. What significant development remains for that branch of the Ardour Project?
PD: Not very much is planned right now. I think there will be a 2.5.1 release in the not too distant future [it was released in September 2008], and after that, it will be mostly bug fixes. It's possible we might add a few new features to get us to 2.6, but I don't see going much beyond that with the 2.x series.
DP: Ardour 3 is certainly one of the most anticipated packages in the world of music and sound software libre. What is its current status, and what major challenges remain before a release is scheduled?
PD: There is one immediate challenge, which is merging the changes between 2.4 and 2.5 to the 3.0 tree. A lot of work was done that touches many things that changed independently in the 3.0 tree, and as a result, the merge presents some major challenges. When 3.0 is released, it will have all the features of the most recent 2.x release.
More long-term than getting this merge done, we have three major areas to focus on. The first is getting 3.0 actually working at the same level of functionality as 2.x. This will take a lot of sustained and minor bug fixing to correct small things that got broken along the way. We also need to ensure that 3.0 can handle sessions created with 2.x—right now, this is not the case. And finally, we need to ensure that our initial release is actually useful for people doing MIDI sequencing. I imagine lots of evolution of our MIDI work flow, which is pretty different from most DAWs already, but we need to start from somewhere that is actually useful and not just a teaser. From our early testers, it seems we have that, but quite a few rough edges and small buglets still need to be sorted out.
DP: Besides a well-deserved vacation, what do you envision for yourself beyond Ardour 3?
PD: We have items on the to-do list for Ardour 4 and beyond already. Thinking that far ahead gives me a headache to be honest. My main concern right now is moving Ardour in a direction where I can make a reasonable (US) income and getting 3.0 out to the public.
DP: JACK also has developed well, and I understand the jackdmp project eventually will replace the current jackd. What improvements will be seen at the programmer's level and by the normal user?
PD: For the programmer, almost none whatsoever. Jackdmp is both ABI- and API-compatible with jackd, so you can compile JACK clients against either one and run them against the other. This is entirely by design—it's not an accident, and it will not change. For developers of JACK, jackdmp (in C++) is a much cleaner codebase than jackd, and this is one of the major motivations for switching to it in the future. JACK, by design, is rather complex software, but the jackd codebase (in C) has way too much hackery and not enough clear design to be a good base for us moving forward. Jackdmp's internal code design, apart from just support for multiple processors, is much more modular, much more object-oriented and basically easier to extend and even to comprehend.
For the user, jackdmp offers multiprocessor support for parallel audio processing configurations (which are actually rather uncommon), and in the future, it may offer ways for users to get all their processors utilized by JACK even if the audio processing configuration has no inherent parallelism. Other than that, we hope users will not see a difference in the basic functionality. That said, other areas evolving rapidly in the jackdmp codebase are ways to start, stop and otherwise control JACK, and this will be of great benefit to many desktop users once we release it.
DP: What are the most important aspects for programmers and users of the new JACK MIDI capabilities?
PD: For programmers, the main benefits are first, a way to send MIDI data between applications with sample-accurate timing, and next, a consistent API for both audio and MIDI. There is one effect that can be a bit of a complication—the consistent API means audio and MIDI data arrive in the same thread. This is a rather fundamental change from other systems where the audio and MIDI APIs are quite distinct and tend to lead to applications handling audio data in one thread and MIDI in another. It's an issue we're still working on in Ardour 3.0.
For users, the main benefits are hopefully just better ways to connect independent applications and get maximal sync between them. This can be tricky to get right at present, for reasons largely out of the user's (or programmer's) control.
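[Editor's illustration] The point about sample-accurate timing and a single processing thread can be sketched in a few lines of Python. This is only a simulation under stated assumptions and does not use the real JACK API: MidiEvent, its offset field and the process function are hypothetical names invented for this sketch. The assumption is that each MIDI event carries a frame offset within the current audio period, so one callback can handle both kinds of data together.

```python
from dataclasses import dataclass
from typing import List, Tuple

NOTE_ON = 0x90   # MIDI status nibble for note-on
NOTE_OFF = 0x80  # MIDI status nibble for note-off


@dataclass
class MidiEvent:
    """Hypothetical event record; not the real JACK structure."""
    offset: int  # frame offset inside the current period
    status: int  # MIDI status byte
    note: int    # MIDI note number


def process(nframes: int, events: List[MidiEvent],
            gate_open: bool) -> Tuple[List[float], bool]:
    """Render one period of a simple gate signal.

    Audio output and MIDI input are handled in the same call, which
    mirrors the "same thread" behavior described in the interview.
    Returns the rendered buffer and the gate state to carry over
    into the next period.
    """
    out = [0.0] * nframes
    pending = sorted(events, key=lambda e: e.offset)
    i = 0
    for frame in range(nframes):
        # Apply every event whose timestamp has been reached, so the
        # gate changes on exactly the frame the event names.
        while i < len(pending) and pending[i].offset <= frame:
            kind = pending[i].status & 0xF0
            if kind == NOTE_ON:
                gate_open = True
            elif kind == NOTE_OFF:
                gate_open = False
            i += 1
        out[frame] = 1.0 if gate_open else 0.0
    return out, gate_open


if __name__ == "__main__":
    period_events = [MidiEvent(3, NOTE_ON, 60), MidiEvent(6, NOTE_OFF, 60)]
    buffer, state = process(8, period_events, False)
    print(buffer)
```

In this sketch the gate opens at frame 3 and closes at frame 6 of an 8-frame period, rather than at the period boundary. That is the practical meaning of sample-accurate timing here. The cost is the one Paul mentions: the MIDI handling now lives inside the audio callback instead of in a separate thread.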
DP: What advice do you typically give to someone who wants to get into programming music and sound software libre?
PD: Please don't start a new project! Find some software that does something useful for you, or close to what you are thinking about, and consider improving it/changing it/redesigning it. Far too many new developers massively underestimate the work involved in creating a useful, final version of any audio/music software—I know I did, by an amount measured in years! There are a few exceptions. Rui Nuno Capela's QTractor Project is progressing at a pace that is embarrassing to me, and little tools like tuners and so forth can be quick to develop and finish. But please, no more MIDI trackers, MIDI sequencers and so on until you've done some work on the existing programs and have established beyond all reasonable doubt that they cannot be bent to your will.
DP: Do you have any general advice for users of music and sound software libre?
PD: No advice, only apologies for developers like myself using them as guinea pigs and testing platforms. Oh, and be patient. Oh, and send money!
DP: What do you consider to be the greatest strength(s) and greatest weakness(es) of Linux audio?
PD: I believe that from a technical standpoint, Linux is still a far superior platform for audio than anything else out there. Not only does it have better performance for just about anything that matters, it also has a suite of development tools that make a developer's life much easier. Many OS X developers love XCode on that platform, which is great but locked into the Mac. And, where is valgrind when you need it? When I started working more intensely on the native OS X version of Ardour, I held OS X and Apple in very high esteem. Their user interface work has always been exceptional. Sadly, I have to admit that as I have gotten deeper into OS X as a developer, I have become less and less impressed. The things that are really great about Linux from a developer's perspective are just not there—most of all simplicity and transparency. It also helps that I helped design JACK, the audio I/O framework that most pro-audio and music apps on Linux use—I am working in a style and paradigm I can truly call my own (though I'd like to note that many other people have commented on its simplicity for developers and its power for users).
However, useful software for noncomputer-centric users means a lot of work for someone. And a lot of work implies a lot of time. In our culture, a lot of time generally means a lot of money, one way or another (inheritance, windfall or income). The Linux audio ecosystem has not yet found good ways of generating this money, and as a result, our software (my own and that of many Linux audio developers) lacks some of the things that can be found in proprietary software for Windows or OS X. I say that there is a direct relationship simply because of the time = money equation. If I could pay the right three people to work on Ardour, it's hard to imagine what we could not do. Ableton employs many, many more people than that and it shows in their software. Look at something incredibly basic like tempo-sync'ed LFOs in a plugin or softsynth. I don't know of any Linux audio software that does this, but it's been in proprietary software for at least five or six years.
So, we have a set of really, really excellent tools for users with a technical/computer-centric perspective on their work. We have not reached the same level of accomplishment in terms of providing tools for people who don't want to understand how computers work. And by comparison with a tool like GarageBand, we haven't even done very well at tools that hide the assumptions about how “professional” audio software is supposed to be used (GarageBand hides this very, very well, though at considerable cost to users who develop sophisticated needs rather rapidly).
DP: What else goes on in your life that you'd like readers to know about?
PD: Moving to Berlin to teach for a semester at the Technical University within a week of moving our son off to Los Angeles to go to film school has to be one of the highest stress things I've ever been involved in.
Dave Phillips is a professional musician and writer living in Findlay, Ohio. He's been using Linux since the mid-1990s and was one of the original founders of the Linux Audio Developers group. He is the author of The Book of Linux Music & Sound (No Starch Press, 2000) and has written many articles on Linux music and sound issues for various journals and on-line news sites. When he isn't playing with light and sound, he enjoys reading Latin literature, practicing t'ai chi, chasing shar-pei puppies and spending time with his beloved Ivy.