Linux Multimedia with PD and GEM: a User's Report
As a multimedia-capable platform Linux has seen terrific growth over the past few years. At the system level, a simple kernel patch can now improve scheduler efficiency and bring performance latencies down to an incredible three milliseconds or less, well within the acceptable range for professional audio and video applications. Given the low-latency patch (along with some other fine-tuning), we can now consider the availability of applications capable of utilizing this enhancement.
Along with performance artist Chris Spradlin, I am currently working on a multimedia presentation that combines video playback/processing, still-image and video projection, and various forms of audio capture, playback and transformation. Controlling the interplay of the different media poses a considerable challenge, particularly because we want to do everything in real time and in Linux. Happily, I have found an excellent solution to our dilemma: Miller Puckette's remarkable Pd.
Pd (pure data) is an audio synthesis/processing environment similar to the famous Max (and its Java offspring jMax). These environments employ a neat scheme of graphically patching various simple components (such as signal generators/modifiers and their control objects) into complex sonic networks.
Figure 1 demonstrates Pd's basic principles: the osc~ signal generator creates an audio waveform (a cosine wave), the slider controls the frequency (pitch) of the waveform and the network around the dbtorms object modifies the amplitude (volume) of the generated signal. Finally, the modified signal is sent to the dac~ object (the digital-to-analog converter) and the results are heard through the audio system.
Pd includes a variety of ready-made objects for signal generation and processing, and if this kind of synthesis patching were all Pd could do, it would still be an impressive audio environment. However, thanks to Mark Danks' wonderful GEM OpenGL-based graphics library, Pd also can manipulate video and image parameters in real time. Pd's flexibility permits arbitrary connection and control between its audio and video streams, creating a powerful environment for controlling and coordinating multimedia presentations.
In order to use Pd with GEM, you must invoke it via the $HOME/.pdrc file or with a command string similar to this one:
pd -rt -lib /home/dlphilp/gem-0.87_2/Gem
The -rt option prioritizes Pd's performance to real-time status. When coupled with a low-latency kernel, Pd's performance is quite acceptable for live shows and other real-time circumstances. You'll want all the help you can get when you're running Pd with the GEM library; the kernel latency patch is a godsend, but you'll still need a hardware-accelerated OpenGL installation to make the best use of GEM.
Note: the test system for this article included an 800MHz Duron CPU, 256MB RAM and a Voodoo3 AGP video card. The Linux kernel version was 2.4.5, patched for low latency; the video subsystem was XFree86 4.0.1. Certain operations in GEM are very CPU-intensive, and I would qualify the Voodoo3 as the low end of acceptable video boards for the Pd/GEM alliance. The audio system included a Sound Blaster Live! sound card running under the ALSA 0.9.0beta10 driver. Note also that Pd version 0.34-4 (stable) was used along with IOhannes Zmoelnig's beta version of GEM 0.87. Previous incarnations of GEM do not include the Linux versions of the pix_movie and pix_film objects needed for the real-time video manipulations described here.
In Figure 2 we see the basic structure of a simple Pd patch utilizing the GEM library functions. Note that the gemwin and gemhead objects are required for all other GEM-related actions. This patch provides the mechanisms for loading a movie (anim-3.mov in this case) and playing it back while rendering it to the surfaces of a cube. The cube size is controlled by the slider movement, and the film can be started and stopped by clicking on the smaller Bang button (one of the two cyan-colored boxes). Figures 3-5 demonstrate this patch in action. The first snapshot (Figure 3) illustrates the simplest rendering, with one face of the cube displayed and expanded to fit most of the rendering window. Figure 4 shows how the rotate object manipulates the cube, and Figure 5 displays the curious effect that occurs when the slider value exceeds the rendering window size. Of course the still images can't convey the effect of the movie playing upon the surfaces of the animated cube, but trust me, it's very cool.
Similis sum folio de quo ludunt venti.