The Linux Soundfile Editor Roundup
A soundfile editor is the audio worker's essential utility for performing various editing operations to massage and finesse recorded sound. Some of these operations are analogous to those of a text editor, such as cut/copy/paste actions, while others are unique to editing audio data.
This article takes you on a whirlwind tour of soundfile editors for Linux. You don't need to know anything special about digital audio or DSP theory, and if you'd like to try any of these programs, all you need on your machine is a working sound system. But, before taking the tour, let's consider what a typical soundfile editor does and how it's typically used.
Editing audio is best accomplished with a graphical interface. A visual representation of the audio data makes it simple to locate those parts of the sound that require attention, such as gaps and amplitude spikes. This ability to find quickly where an operation is needed speeds up the editing process. Regions can be marked or selected accurately, and zoom routines let users enlarge or diminish their view of a file at any given point, making it possible to perform large-scale and sample-accurate edits easily and quickly.
A well-designed soundfile editor should include at least these basic operations and capabilities:
Compress/expand time scale.
Sample rate conversion.
Display in different time formats.
Display multiple views of file.
Display multiple files simultaneously.
Independent X/Y axis control.
Find maximum sample value.
Display amplitude envelope in various representations (dB, peak, RMS).
Edit pitch and amplitude envelopes.
Print out display of samples.
As we shall see, the editors profiled in this article meet most of these baseline criteria, often adding unique functions and routines.
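To make two of the listed items concrete, here is a minimal sketch of finding a file's maximum sample value and expressing its peak and RMS levels in dB. It assumes the samples are already loaded as floating-point values between -1.0 and 1.0; the function name peak_and_rms_db is hypothetical and is not taken from any of the editors profiled here.

```python
import math

def peak_and_rms_db(samples):
    """Return (peak_db, rms_db) relative to full scale (1.0).

    Assumes a non-empty sequence of floats in the range -1.0..1.0.
    """
    # Peak: the largest absolute sample value in the file
    peak = max(abs(s) for s in samples)
    # RMS: square root of the mean of the squared samples
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))

    def to_db(value):
        # Digital silence has no finite dB value
        return 20.0 * math.log10(value) if value > 0 else float("-inf")

    return to_db(peak), to_db(rms)

# Ten full cycles of a sine wave at half of full scale
sine = [0.5 * math.sin(2 * math.pi * n / 100) for n in range(1000)]
peak_db, rms_db = peak_and_rms_db(sine)
print(round(peak_db, 2), round(rms_db, 2))  # -6.02 -9.03
```

The roughly 3dB gap between the two figures is what you would expect for a sine wave; real recordings usually show a much larger difference between peak and RMS levels.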
Table 1 presents a further set of features, most but not all Linux-specific, and outlines how the editors presented here accommodate them.
Table 1. Soundfile Editor Features
|Snd||y||y||y||Motif, GTK ||disk||GPL|
 Can be compiled without graphics.  Works with ALSA's OSS/Free emulation.  Free for noncommercial use, freely distributable.  Free for unrestricted use, freely distributable.
Now, let's consider some possible uses for a soundfile editor. The following list is by no means a complete summary; it merely indicates how I typically employ an editor in my own work:
Trimming excess silence from recordings.
Cutting large files into smaller pieces.
Adding effects such as reverb, chorus and flanging.
Slowing playback speed without changing pitch.
Removing cracks and pops from recordings.
Converting sample rate.
Converting file format.
Equalizing or filtering the sound to make it brighter or give it more bass.
Using the editor as a music and sound composition tool.
Dozens of other operations can be added to this list, and each user will find special uses for a soundfile editor.
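As an illustration of the first item in that list, trimming excess silence can be sketched in a few lines. This is only a sketch under simple assumptions: samples are floats between -1.0 and 1.0, "silence" means anything at or below a fixed threshold, and the name trim_silence and its default threshold are my own inventions rather than any editor's actual routine.

```python
def trim_silence(samples, threshold=0.01):
    """Drop leading and trailing samples whose level is at or below threshold.

    Quiet passages in the middle of the recording are left untouched.
    """
    # Indexes of every sample that is louder than the threshold
    loud = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not loud:
        return []  # the whole file counts as silence
    return samples[loud[0]:loud[-1] + 1]

recording = [0.0, 0.001, 0.5, -0.3, 0.0, 0.2, 0.0, 0.0]
trimmed = trim_silence(recording)
print(trimmed)  # [0.5, -0.3, 0.0, 0.2]
```

A graphical editor lets you do the same thing by eye, which is often safer: a fixed threshold can clip the quiet tail of a reverb or a soft entrance.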
Slowing the playback speed of a file without changing its pitch comes in handy when I am teaching. Students often bring in recordings that are not particularly easy to hear or transcribe when played at normal speed. I convert the original recording (CD or MP3) to a WAV file, load it into the Snd editor and slow the playback speed, still keeping the original pitch, until I can hear each note clearly. This makes it easier to write out an accurate transcription. Some editors let you do this in real time, and some allow a defined looping region to be arbitrarily redefined during playback, a very helpful feature.
Normalization scales every sample in a file by the same gain so that the file's highest peak reaches a target level, usually full scale; the relative balance within the file is unchanged. By normalizing my project files before burning them to CD, I balance the volume differences between pieces. Normalization is a common premastering operation in professionally produced recordings.
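The arithmetic behind peak normalization is simple enough to sketch. The example below assumes floating-point samples between -1.0 and 1.0; the function name normalize and the target_peak parameter are hypothetical, and the editors discussed here may well offer other options, such as RMS-based normalization.

```python
def normalize(samples, target_peak=1.0):
    """Scale all samples by one gain factor so the loudest reaches target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence or empty input: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]

quiet = [0.0, 0.1, -0.25, 0.2]
loud = normalize(quiet)
print(max(abs(s) for s in loud))  # 1.0
```

Because a single gain factor is applied throughout, a lone amplitude spike can hold the rest of the file down, which is one more reason to repair spikes before normalizing.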
A soundfile editor also is useful for repairing badly formed compressed soundfiles. Some editors load MP3 and Ogg files directly, while others perform a format conversion and then load the converted file. I can remove excess silence and fix damaged spots in the recording by removing or redrawing them. The waveform display represents damaged spots as broken or incomplete curves. I also normalize and equalize the sound before converting it back to its original format. Converting a lossy-encoded file to a soundfile format and back again to a lossy format typically results in less than optimal audio quality, so I rebalance the sound's frequencies with the editor's equalization tools.
In the space allotted for this article, I can scratch only the surface of each profiled editor. We look at each program's salient aspects, but you should try each one yourself to measure the depths. Let's start this roundup with some of the older soundfile editors available for Linux.
The first wave of soundfile editors for Linux arrived at a time when OSS/Free was the system audio interface and Motif was the most attractive GUI toolkit. They were all designed to compile and run on a variety of UNIX systems, not only Linux.
Bill Schottstaedt has been crafting Snd in one form or another since the era of the PDP minicomputers. However, Snd as we know it dates from late 1996, with Linux support added in 1997. Snd is a remarkable program. It can be regarded as an exceptionally powerful soundfile editor, as an infinitely programmable audio editing toolkit or as a graphic component of the Common environment of music and sound applications. The Common family of audio applications includes Bill Schottstaedt's Common Lisp Music (software sound synthesis) and Common Music Notation and Rick Taube's Common Music (metalanguage for music composition). These are all Lisp-based applications that can be configured for complex interactivity. The key to unlocking Snd's power lies in its Guile interface, based on the Lisp-like Scheme programming language. Snd's interface includes a window called the Listener in which the user can enter Guile commands for customizing and redefining every aspect of the program.
The screenshots illustrate how extensively Snd can be configured. Figure 1 shows the Motif version of Snd in its default appearance. In Figure 2, the user interface has been customized extensively. New menus have been added for Snd's internal DSP modules and LADSPA plugins, the background and colors are user-defined, and some complex widgets have been custom-built for some of Snd's effects processors. Figure 2 also shows off Snd's amplitude waveform view along with its OpenGL display of the sound's spectrum, its frequency content. Context-sensitive pop-up menus are available for the different displays as well as for selected and unselected regions of a file.
Snd is my favorite soundfile editor, but I admit that it may not be the editor for everyone. If you learn a little Lisp, you can discover more of Snd's power. Fortunately, Snd comes with exhaustive documentation to ease difficulties in the learning process. Predefined configuration files also are available for quicker and easier customization.
The 1.0 release of Doug Scott's MiXViews occurred in 1995, making it the longest-living Linux soundfile editor profiled here. MiXViews was and is a one-man effort to provide UNIX and Linux with a high-quality audio editor. The project continues to be a solo development effort, and it still provides a high-quality editor.
MiXViews provides a strong suite of the basic soundfile editing functions and adds some features found in no other Linux soundfile editor. Phase vocoding and linear predictive coding (LPC) are digital signal analysis/resynthesis tools more commonly associated with software sound synthesis programs such as Csound or Common Lisp Music. These tools analyze a sound for its frequency and amplitude values and store those values in special analysis file formats. An analysis file can be read by a program such as Csound that gives the user independent control over the frequency and amplitude components of the analysis data before its resynthesis to a soundfile. MiXViews provides a complete suite of its own LPC and phase vocoder utilities; it also can read and edit analysis files created by the Csound phase vocoder.
Figure 3 displays some of MiXViews' graphic tools for editing phase vocoder and LPC analysis data. Although the theory and mathematics behind these tools can be intimidating, the MiXViews interface invites experimentation, making the tools themselves easier and more interesting to use.
If you want to try MiXViews, I suggest using the prebuilt binary. Compiling MiXViews is somewhat tricky, and it requires an uncommon graphics toolkit (InterViews), so simply download the static binary and start using it.
DAP (the Digital Audio Processor) is programmer Richard Kent's contribution to multiplatform soundfile editors. Like MiXViews, DAP builds its GUI on a not-so-new toolkit, the XForms library. Also like MiXViews, DAP's workable soundfile size is limited by your system RAM. In addition, DAP includes some exceptionally well-implemented loop editing tools for AIFF soundfiles. DAP also includes a good selection of DSP modules (extended from Kai Lassfolk's SPKit code) and a handy mono-to-stereo and mono/stereo-to-quad converter.
Some of DAP's editing tools deserve special mention, particularly those found in the Resample and Edit/Mix dialogs. The Resample menu provides pitch and sample rate change with or without time stretching, while the Edit/Mix dialogs (Mix and Mix Range) provide a neat graphic control over the balance of the mixed file amplitudes. The influence of the AIFF file format and its loop support is found throughout the program. For example, when an effect is applied to a file the DSP dialog panel provides a control for the iterations of the sustain and release loops (Figure 4). Although DAP's design is biased toward the AIFF format, it also imports and exports files in RAW and WAV formats.
Alas, DAP is no longer consistently maintained. Its XForms GUI is showing its age, and its file size limitation is a serious drawback. The author is honest about DAP's limitations, but if you're working with AIFF files with embedded loop points, DAP is still a useful tool.
The next group of editors belongs to the new wave of Linux audio development. Their natural environment includes the modern graphic interface toolkits and the newer Linux sound system components, such as ALSA, JACK and LADSPA. They also are conceptually more homogeneous than their predecessors, resembling the popular editors familiar to Windows and Mac users.
Audacity is a fitting first representative of this new wave of Linux soundfile editors. It is written in C/C++, uses the wxWindows cross-platform GUI toolkit and supports native and LADSPA signal processing plugins. Recent versions also are JACK-aware, giving Audacity the ability to route its I/O to or from other active JACK-aware programs.
Figure 5 illustrates Audacity with three soundfiles opened—a mono WAV file, a stereo AIFF file and a file in Sun's AU format. Audacity also imports MP3 and Ogg files. It exports Ogg files directly, thanks to the Ogg/Vorbis libraries, but MP3 export depends on a user-supplied encoder. Figure 5 also shows Audacity's native equalizer plugin at work.
Audacity's graphic editing tools are a pleasure to use. Figure 6 shows off the envelope tool's effect on the amplitude contour of one of the soundfiles seen in Figure 5. At the individual sample level, Audacity's drawing tool makes it easy to remove or repair amplitude spikes and other discontinuities.
Like Snd, Audacity features an interface to a Lisp-based programming language, Roger Dannenberg's Nyquist. Nyquist is a language designed for sound synthesis and signal processing, and Audacity's Effects menu provides a Nyquist Prompt that works essentially like Snd's Listener. You enter a Nyquist expression in the prompt dialog, click on the OK button and if the expression is valid, Audacity performs the intended process on the active soundfile.
There's far more to Audacity than I can possibly describe here. Fortunately, the program is easy to learn and use, so check it out for yourself.
With its colorful interface and excellent organization, Davy Durham's ReZound is a pleasure to see as well as to use. But eye candy is the least of ReZound's features; the program also provides a complete suite of editing tools, excellent transport controls, some impressive native filters, support for LADSPA plugins and a unique audio remastering/burn-to-CD facility.
Figure 7 illustrates ReZound with three soundfiles loaded and the Curved Balance tool at work on the active file. Curved Balance is one of ReZound's remastering utilities; others include a noise gate, a dynamics compressor and gain and normalization controls. These tools, along with ReZound's other editing amenities, let you massage your sounds to perfection before burning them to CD. ReZound even provides a simple dialog for burning the CD (by way of the cdrdao program) directly from ReZound's File menu.
ReZound's LADSPA support extends to good basic support of the LADSPA VST host plugin (vst.so) from Kjetil Matheussen. This plugin to support plugins provides a usable interface based on WINE for running VST/VSTi plugins under Linux. The vst.so plugin currently is in its early stages, and its degree of harmony with the hosting application varies considerably. Figure 8 shows ReZound employing a VST plugin to apply an effect to the active soundfile.
The most recent version of ReZound supports the JACK audio server, connecting ReZound to the JACK network of intercommunicating audio applications. More JACK enhancements are planned, along with effects previewing, noise removal tools, native time/pitch scaling and many other features and improvements.
At first look, Conrad Parker's Sweep seems much like the other newer editors profiled here. It's ALSA-aware, provides a good basic editing suite, supports LADSPA plugins and shows off a nice modern GTK interface. Sweep offers two unusual tools, though, that give it special value. One is the ability to define multiple regions for nonlinear editing; the other is an interesting little tool named Scrubby.
Defining a selection in a soundfile typically is achieved by placing the cursor at the selection start, then left-clicking and holding as you sweep the cursor to the selection end. This selection method is a common practice employed by all the editors reviewed here. That's how it's done in Sweep too, but Sweep also lets you define multiple selections. Hold down the Ctrl key while making your selections, and voilà, you have multiple-defined selections available for further processing.
An Invert Selection function provides a neat way to create a dialogue of effects over a soundfile. Figure 9 illustrates the aftermath of this series of alterations: define multiple selections; apply a reverse effect to those selections; invert the selection areas (non-selected becomes selected, and vice versa); and apply a LADSPA effect to the new selections. This is fun stuff and a powerful and creative feature.
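The idea behind inverting a set of selections can be sketched as follows. This is not Sweep's code; it is a small illustration that assumes each selection is a (start, end) pair of sample indexes with the end exclusive, and the name invert_selections is hypothetical.

```python
def invert_selections(selections, length):
    """Return the regions of a file of the given length NOT covered by selections.

    selections: list of (start, end) sample-index pairs, end exclusive.
    """
    inverted = []
    cursor = 0  # first sample not yet accounted for
    for start, end in sorted(selections):
        if start > cursor:
            # The gap before this selection becomes a new selection
            inverted.append((cursor, start))
        cursor = max(cursor, end)  # also copes with overlapping selections
    if cursor < length:
        inverted.append((cursor, length))
    return inverted

regions = [(100, 200), (500, 650)]
print(invert_selections(regions, 1000))
# [(0, 100), (200, 500), (650, 1000)]
```

Applying one effect to the original regions and a different effect to the inverted ones gives the alternating result described above.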
Scrubby is Sweep's virtual stylus, essentially functioning as a freely movable playback head with features more typically associated with a DJ's turntable system. Scrubby transforms Sweep into a performance tool, an unusual characteristic for a soundfile editor. A screenshot can't possibly do justice to Scrubby; you have to use the software to see and hear it in action.
The soundfont format (SF2) is a complex soundfile format that includes not only audio data but also data regarding various effects and performance controls. Soundfonts have become ubiquitous in the audio world, finding employment in applications such as Csound, jMax, Pd, Fluidsynth and many others.
If you want to work with soundfonts, you definitely want Josh Green's excellent Swami in your toolkit. Swami is a soundfont-only editor with a well-designed GUI and a wealth of useful features. You can edit existing soundfonts and create your own from the level of individual samples up to the composite instrument. You also can edit and design your own soundfont banks. Other tools are available for setting an instrument's velocity response curves, keyboard zoning (mapping) and modulator routing. Because of Swami's strict focus, I don't have much more to say about it; it is first-rate and is highly recommended Linux audio software.
Kåre Sjölander and Jonas Beskow have developed WaveSurfer to function best in the context of speech research, a domain covering a variety of audio-related disciplines. WaveSurfer is a perfectly useful general-purpose soundfile editor, but its special strengths reside in its tools for analyzing, editing and visualizing the spoken word.
WaveSurfer is written in the popular Tcl/Tk scripting language and widget toolkit, providing the motivated user complete access to the program's internals. Sound processing in WaveSurfer is handled by the SNACK audio functions library, also written by Kåre Sjölander. SNACK itself may be extended by user-defined plugins written in C/C++.
Figure 11 illustrates a simple use of WaveSurfer in speech analysis and representation. The main panel displays the region highlighted in the complete waveform, and the label display indicates the sound's phonemes. The label track is only one example of WaveSurfer's speech-oriented amenities. Others include spectrographic displays, pitch curve extraction and support for a variety of soundfile and transcription formats.
GLAME's developers have implemented an unusual design philosophy in their editor. GLAME (GNU/Linux Audio MEchanics) supplies the expected palette of tools for audio editing, but it also includes a powerful synthesis and processing environment called the filternetwork. A filternetwork provides a canvas on which icons representing synthesis primitives are patched together to create a processing or synthesis chain. Current primitives include oscillators, envelope generators, filters, I/O modules and LADSPA plugins. Once a synthesis network has been designed, it can be run to produce real-time audio or output to a file for further processing (in GLAME, of course). Right-clicking on the waveform display pops up a menu that includes the Apply Custom item. By selecting this item, you can apply your filternetwork to the active soundfile, suggesting some interesting processing possibilities.
Figure 12 illustrates a simple example. The selection in the waveform display has been processed by a filternetwork composed of a gain control, a LADSPA delay plugin and a flanger. The track modules are included as the default I/O ports, representing the original input and the processed audio output.
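For readers who think better in code than in patch cords, the notion of chaining processing primitives can be sketched roughly like this. It is a deliberately simplified linear chain, not GLAME's actual filternetwork model or API; the names gain, delay and run_network are hypothetical, and samples are assumed to be floats.

```python
def gain(factor):
    """Build a stage that multiplies every sample by factor."""
    return lambda samples: [s * factor for s in samples]

def delay(delay_samples, mix=0.5):
    """Build a stage that adds one delayed, attenuated copy of the input."""
    def process(samples):
        # Extend the output so the delayed copy is not cut off
        out = list(samples) + [0.0] * delay_samples
        for i, s in enumerate(samples):
            out[i + delay_samples] += s * mix
        return out
    return process

def run_network(samples, stages):
    """Feed the output of each stage into the next one."""
    for stage in stages:
        samples = stage(samples)
    return samples

network = [gain(0.5), delay(2, mix=0.5)]
result = run_network([1.0, 0.0, 0.0], network)
print(result)  # [0.5, 0.0, 0.25, 0.0, 0.0]
```

A real filternetwork is a graph rather than a simple list, so one output can feed several inputs, but the principle of small modules patched together is the same.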
Olivier Gäumann's Layer-based Audio Editor (LAoE) offers yet another unique design philosophy. An editing session in LAoE consists of building a stack of soundfiles and then opening the desired editing and processing tools for application upon one or more of the layers (soundfiles) in the stack. At first it felt like a rather strange way to work, but after comprehending the program's organization, I began to enjoy its layout and developed a fast work mode with it.
LAoE receives extra points for originality by providing direct editing in its spectral display. A user-defined brush is used to paint over areas for FFT filtering, and the filter itself can be adjusted for finer resolution. Most of this article's editors offer spectral displays, but only LAoE permits direct spectral editing.
LAoE also is the only Java-based editor reviewed here. I've installed Sun's JDK 1.4 on my 800MHz machine, not exactly a fast machine by today's standards, but LAoE's interface was quick and responsive throughout.
Pascal Haakmat's GNUsound is modest in appearance but rich in content. Once again we have a full complement of the basic editing tools, LADSPA plugin support and some special tools for marking, selecting and viewing soundfiles. GNUsound also adopts the concept of tracks; that is, you can designate a number of files for mixdown in a process similar to the mixing process in a multitrack recorder.
Another neat aspect of GNUsound is its implementation of envelopes for effects processing. One of two user-defined envelopes may be selected as control curves for an associated processing parameter, giving a more dynamic contour to your effects processing.
Although GNUsound is intended for use in the GNOME environment, I had no trouble building it under a Planet CCRMA Red Hat 9.0 system and using it in the BlackBox window manager.
KWave has been in development since 1999, so perhaps I should have included it along with the venerable stalwarts. However, its development team has kept pace with its intended target environment, KDE, thus giving KWave a more modern look and feel along with some interesting improvements.
The new KWave has also retained the original's emphasis on graphical tools for processing your files. Figure 15 shows off KWave's low-pass filter editor, complete with a processing preview function. The Listen button loop plays your file or selection while you adjust the filter in real time, a handy feature for testing effect parameters.
Some of my favorite tools from the original KWave, such as the additive synthesis generator, have not yet been reimplemented. Those tools currently are grayed-out in the menus, but the developers plan to restore those functions and add new features. Like GNUsound, KWave limits file size to available memory, but it otherwise is a fine editor and is well suited for casual use on the KDE desktop.
I hope this article has stimulated interest in checking out some of these applications. Believe it or not, other soundfile editors are available, though I have tried here to focus on the most popular ones. The Soundfile Editors section of the Linux Sound & MIDI Applications site has a full listing of the available Linux soundfile editors (see the on-line Resources section).
So which one is right for you? It's hard to say. I'm partial to Snd for its vast programmability and to ReZound for its GUI and organization, but you have to try some and see which fit your needs best. Above all, don't be intimidated by the apparent complexity of some of these editors. Approach them as you would The GIMP, testing their features at random until you have a sense of what they can do—don't be shy about clicking on the Undo button. Playing around with this kind of software can be fun and open some interesting creative avenues. And if you come up with some sounds to share, feel free to let me know about it. Now, go forth and edit those joyful noises.
Resources for this article: /article/7506.
Dave Phillips is a musician, teacher and writer living in Findlay, Ohio. He has been an active member of the Linux audio community since his first contact with Linux in 1995. He is the author of The Book of Linux Music & Sound, as well as numerous articles in Linux Journal.