Dirt-Cheap 3-D Spatial Audio
After all of the testing and calibration were completed, we performed two informal, qualitative user tests to help validate our new low-cost spatial audio system. The first test evaluated how the new eight-speaker configuration compared with our previous planar configuration of four speakers. For this test, the four-speaker configuration simply used the four speakers on the top of the cube array. We realize that directly comparing these two configurations is somewhat biased, because the four-speaker array sits above the user's head. It would be fairer to compare against a four-speaker array located at the height of the user. However, by using the top four speakers, we were able to switch between the two configurations without dismantling our installation.
We performed the experiment by asking a few test subjects to stand in the middle of the immersive room and listen to sounds played for each configuration. We played different sequences of audio on both speaker configurations and made use of the full range of speakers available. The subjects were not told which configurations were being used, nor in which order the pairs of configurations were presented. Several iterations of the pairs of configurations were tried for each subject. After each pair was presented, the subjects rated the two systems. Admittedly, this was not a scientific test, as is evidenced by several unaddressed biases, but all test subjects clearly preferred the eight-speaker configuration.
The second user test evaluated how well a listener is able to localize the source of the audio using the eight-speaker configuration. Again, each subject was asked to stand in the center of the immersive room. Each subject was presented with several sounds, played one at a time and originating from different positions surrounding the subject. The subjects were asked to point in the direction of the sound source as they heard it. The visual system was not running, so the subjects did not get visual cues as to the sound source's location. The subjects were able to localize the sounds with a high degree of accuracy, especially with respect to elevation.
Integrating our 3-D spatial audio system with our immersive room has enhanced our simulation and training demos. The completed system has dramatically improved the sense of immersion when running the demos. A simulation user easily perceives helicopters and jets flying overhead and a tank rumbling down one of the many nearby streets in the virtual world. The distance to an audio source is conveyed accurately, and the system also includes Doppler effects. Our system is a step above the four-speaker solution we had previously, which used the Microsoft DirectSound API. It is also a good replacement for the capable but outdated and unsupported eight-speaker solution we had been running on another, expensive hardware and software platform.
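To make the Doppler effect mentioned above concrete, here is a minimal sketch of the classical Doppler formula for a source and listener moving along the line between them. It is an illustration only, not the code used in our system; the function name `doppler_frequency`, its parameters and the constant for the speed of sound are assumptions chosen for this example.

```python
# Illustrative sketch only: classical Doppler shift for motion along the
# line joining source and listener. Names and defaults are hypothetical.

SPEED_OF_SOUND = 343.0  # metres per second, in air at roughly 20 degrees C


def doppler_frequency(f_source, v_source_toward, v_listener_toward=0.0,
                      c=SPEED_OF_SOUND):
    """Return the frequency heard by the listener.

    f_source          -- frequency emitted by the source, in Hz
    v_source_toward   -- source speed toward the listener (negative if receding)
    v_listener_toward -- listener speed toward the source (negative if receding)
    c                 -- speed of sound in the medium
    """
    if v_source_toward >= c:
        # The formula breaks down at or above the speed of sound.
        raise ValueError("source speed must be below the speed of sound")
    return f_source * (c + v_listener_toward) / (c - v_source_toward)


if __name__ == "__main__":
    # A 440 Hz source approaching, then receding, at 30 m/s.
    print(doppler_frequency(440.0, 30.0))
    print(doppler_frequency(440.0, -30.0))
```

An approaching source is heard at a higher pitch and a receding one at a lower pitch, which is the cue a listener gets when a jet passes overhead in the demos.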
We have devised a true 3-D spatial audio solution that is low cost and has comparable quality to expensive high-end commercial systems. The 3-D spatial audio solution allows sound effects to be generated from all directions surrounding a user, not only planar directions. We accomplished this feat by using only commodity hardware and open-source software. We feel this feature, now available at an affordable price, creates numerous options for game and virtual reality system developers.
We feel our system leads the way for others to devise similar solutions with current and future commodity audio equipment. The developer needs only to purchase a Dolby Surround Sound 7.1 audio card, four pairs of low-cost speakers and audio cables. We spent less than $150 US on hardware (an Audigy 2 audio card and audio cables), as we already had speakers available. From start to finish, including hardware and software debugging, configuring and testing, we spent less than a month developing the low-cost 3-D spatial audio system. We feel that, with this document as a guide, others should be able to implement this system in less than a week.
Although the system currently meets our needs quite nicely, these features would be nice to add to the 3-D spatial sound API in the future:
Directional sound cones: directional sound cones are a mechanism to provide directional sound with the strongest intensities propagated along the central axis of the cone and weakest toward the edges. Because many sound sources are directional by nature, such as sound emanating from a megaphone, directional sound cones would allow these sound sources to be generated more accurately. Also, because some major APIs, such as Microsoft DirectSound, offer sound cones, it would be nice to offer such a feature.
Additional environmental reverberation effects: although a number of simple environmental reverberation effects are available in the common 3-D spatial audio APIs, supporting more sophisticated effects, such as sound reflection off and absorption by different surfaces, would greatly enhance the listening experience. This is an area of ongoing research, and the Mustajuuri system would be a good testbed for trying out new techniques.
Enhanced sound attenuation level: the current distance attenuation models for sound in Mustajuuri are quadratic by nature and thus fairly simple. In the real world, sound attenuation is much more sophisticated and depends on temperature, humidity, sound frequency and many other factors. For example, low-frequency sounds generally carry much farther than high-frequency sounds. Accounting for these complexities could help significantly in providing distance cues.
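Two of the items above can be sketched in a few lines of code. The example below is a hedged illustration, not Mustajuuri's or DirectSound's actual implementation: `cone_gain` assumes an inner/outer-angle cone with linear interpolation between full gain and an outside gain, loosely in the spirit of the cone model offered by APIs such as DirectSound, and `distance_gain` assumes a plain inverse-square falloff as one reading of "quadratic" attenuation. All function and parameter names are hypothetical.

```python
import math


def cone_gain(axis, to_listener, inner_deg, outer_deg, outer_gain):
    """Gain of a directional source as heard from the listener's direction.

    axis        -- 3-D vector along the central axis of the sound cone
    to_listener -- 3-D vector from the source to the listener
    inner_deg   -- full angle of the inner cone (gain is 1.0 inside it)
    outer_deg   -- full angle of the outer cone (gain is outer_gain beyond it)
    outer_gain  -- gain applied outside the outer cone, between 0.0 and 1.0
    """
    dot = sum(a * b for a, b in zip(axis, to_listener))
    norm = (math.sqrt(sum(a * a for a in axis))
            * math.sqrt(sum(b * b for b in to_listener)))
    # Clamp to guard against rounding pushing the cosine outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    angle = math.degrees(math.acos(cosine))

    half_inner = inner_deg / 2.0
    half_outer = outer_deg / 2.0
    if angle <= half_inner:
        return 1.0
    if angle >= half_outer:
        return outer_gain
    # Assumed transition: interpolate linearly between the two cones.
    t = (angle - half_inner) / (half_outer - half_inner)
    return 1.0 + t * (outer_gain - 1.0)


def distance_gain(distance, ref_distance=1.0):
    """Inverse-square falloff, clamped to 1.0 inside ref_distance."""
    d = max(distance, ref_distance)
    return (ref_distance / d) ** 2


if __name__ == "__main__":
    # A megaphone pointing along +x, heard from 45 degrees off axis at 2 m.
    gain = cone_gain((1, 0, 0), (1, 1, 0), 60, 120, 0.2) * distance_gain(2.0)
    print(gain)
```

A frequency-dependent model of the kind described in the last item would replace `distance_gain` with a function that also takes frequency, temperature and humidity; that is deliberately left out here because the right formula is a design decision for the API.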