Advanced 3-D Graphics: GNU Maverik—a VR Micro-Kernel
GNU Maverik is a system to help programmers create virtual environments (VEs). It was built by an academic research group to tackle some of the problems found when using existing approaches to VE construction. Maverik's main contribution is its framework for managing graphics and interaction, but it comes with extensive built-in functionality to make getting started straightforward. However, Maverik is a tool for programmers—it is not an end-user application.
Maverik is part of the GNU project and is distributed freely with full source under the GNU GPL. The distribution includes documentation, a tutorial and examples. Released in February 1999, Maverik has been downloaded by thousands of sites worldwide and has received positive feedback from both academic and commercial organizations.
In this article, I give some background on the challenges facing the designers of virtual environments and then describe how Maverik addresses some of them.
It's easy to get enthusiastic about interactive 3-D graphics, and to imagine great things—walking around inside your dream house; conversing face to face with a distant friend in a shared paradise; or rehearsing complex engineering procedures with professional colleagues around the world in a shared virtual environment.
These things are easy to imagine, but surprisingly hard to do: witness the lack of really compelling examples of this kind of technology in use. So what's the problem? Until quite recently, the limitations of computer hardware and VR peripherals were seen as the main obstacle. In the last year or so, PC graphics accelerators have made dramatic strides forward as 3-D becomes a more mainstream facility. Today, inexpensive 3-D PC accelerators rival the performance of the most expensive 3-D workstations, and are beginning to include options such as economical stereo shutterglass support.
The limiting factor—the gap between what we can so easily imagine and what we can readily achieve—is now more clearly exposed as being software. Writing 3-D applications is hard work. It takes hundreds of person-hours (programmers and artists) to create the impressive games and animated film sequences we have come to expect of 3-D computer graphics. Yet film animations such as Toy Story use techniques that are way beyond what can be achieved at interactive framerates. The best of today's PC graphics cards is something like 10,000 times too slow to do Toy Story in real time. That is largely due to the complexity of the modeling and sophisticated lighting calculations used. It is possible to do impressive things in real time, as computer games demonstrate. But games rely upon methods such as texture mapping to give the illusion of complexity within a VE far beyond what is actually present. In a game, the complex metalwork and scenery that you see are mostly stage scenery: you cannot take the girder away from the wall, it's only a picture of a girder.
In contrast, for real engineering tasks, the complexity of the CAD models can be staggering—a model of a real offshore platform can, for example, equate to half a gigabyte of polygonal data which must somehow be processed 10-30 times per second if it is to be any use for interactive work in a VE (see Figure 1).
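To put a number on that, here is a back-of-the-envelope sketch in Python. It assumes, naively, that the whole half-gigabyte model is touched on every frame, which is exactly the situation the rest of this article tries to avoid. The function and variable names are mine, for illustration only.

```python
def required_throughput_gb_per_s(model_size_gb: float, frames_per_second: float) -> float:
    """Data that must be processed each second if the whole model is handled every frame."""
    return model_size_gb * frames_per_second


MODEL_SIZE_GB = 0.5  # "half a gigabyte of polygonal data", from the text

low = required_throughput_gb_per_s(MODEL_SIZE_GB, 10)   # 10 frames per second
high = required_throughput_gb_per_s(MODEL_SIZE_GB, 30)  # 30 frames per second

print(f"{low:.0f} to {high:.0f} GB of polygon data per second")
# prints "5 to 15 GB of polygon data per second"
```

Brute force would therefore mean pushing somewhere between 5 and 15 gigabytes of polygon data through the system every second.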
Let's think about the kind of things that need to be done to work with a model of this complexity; this will help explain the reasoning behind Maverik. To cope with the offshore platform, we need to find ways of rapidly focusing in on only those parts that need drawing at any one instant. For example, in a building with many rooms, we can quickly discard large parts of the model that we know cannot be seen from the current position. In a large “cityscape” at ground level, we could work out which buildings are visible and not occluded by others, and rapidly discard everything else. So some application-specific tests can quickly discard a lot of irrelevant data. In some cases, such tests do not help; on the offshore platform, we can see between the pipes all the way across the rig, so we need a different strategy. In the limit, though, we will run out of time to render the current frame, so some bail-out option is necessary. One example would be to fall back to a wire-frame representation for more distant parts of the view (see Figure 2). That does not make such a nice still picture, but if it lets you work interactively within the VE, then it might be tremendously effective compared to a “jerky” presentation.
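The two ideas in this paragraph, an application-specific cull followed by a bail-out when time runs short, can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the `SceneObject` fields, the per-object cost estimates and the `plan_frame` function are all hypothetical and have nothing to do with Maverik's actual interfaces.

```python
from dataclasses import dataclass


@dataclass
class SceneObject:
    name: str
    room: str             # hypothetical: the room (or deck) that contains the object
    distance: float       # distance from the viewer
    solid_cost_ms: float  # assumed time to draw fully shaded
    wire_cost_ms: float   # assumed time to draw as wire-frame


def plan_frame(objects, visible_rooms, budget_ms):
    """Return (name, style) pairs describing how each surviving object is drawn."""
    # Step 1: application-specific cull; drop everything in rooms we cannot see.
    candidates = [o for o in objects if o.room in visible_rooms]
    # Step 2: spend the time budget on the nearest objects first.
    candidates.sort(key=lambda o: o.distance)

    plan = []
    spent_ms = 0.0
    bailed_out = False
    for obj in candidates:
        if not bailed_out and spent_ms + obj.solid_cost_ms <= budget_ms:
            plan.append((obj.name, "solid"))
            spent_ms += obj.solid_cost_ms
        else:
            # Bail-out: this object and everything farther away is wire-frame.
            # (The sketch does not guard against wire-frame itself overrunning.)
            bailed_out = True
            plan.append((obj.name, "wireframe"))
            spent_ms += obj.wire_cost_ms
    return plan


objects = [
    SceneObject("valve", "deck_a", 2.0, 4.0, 0.5),
    SceneObject("pipe_run", "deck_a", 10.0, 8.0, 1.0),
    SceneObject("crane", "deck_a", 40.0, 12.0, 1.5),
    SceneObject("galley", "deck_b", 5.0, 6.0, 0.8),
]
plan = plan_frame(objects, visible_rooms={"deck_a"}, budget_ms=15.0)
print(plan)
# prints [('valve', 'solid'), ('pipe_run', 'solid'), ('crane', 'wireframe')]
```

The galley is never considered because its deck is not visible, and the distant crane drops to wire-frame because drawing it solid would overrun the 15 ms budget.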
A promising approach is to replace the wire-frame portion with image-based rendering techniques, filling in the background with appropriate still pictures of the rig. These stills are then dynamically distorted so you do not notice that they are stills. That is a subject of current research. For a VE, it is important to make any such transitions between representations smoothly, and avoid objects “popping” between different rendering styles, which is quite disturbing.
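One common way to avoid popping is to cross-fade between two representations over a band of distances rather than switching at a single threshold. The sketch below shows only that blending arithmetic; it is not taken from any particular system, and the function names and the fade distances are assumptions chosen for illustration.

```python
def smoothstep(edge0: float, edge1: float, x: float) -> float:
    """Standard smoothstep: 0 below edge0, 1 above edge1, smooth in between."""
    t = (x - edge0) / (edge1 - edge0)
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)


def representation_weights(distance: float, fade_start: float, fade_end: float):
    """Return (detailed_weight, fallback_weight); the two always sum to 1."""
    fallback = smoothstep(fade_start, fade_end, distance)
    return 1.0 - fallback, fallback


# Hypothetical fade band: fully detailed up to 20 units, fully fallback beyond 40.
print(representation_weights(10.0, 20.0, 40.0))  # prints (1.0, 0.0)
print(representation_weights(30.0, 20.0, 40.0))  # prints (0.5, 0.5)
print(representation_weights(50.0, 20.0, 40.0))  # prints (0.0, 1.0)
```

Because the weights change continuously as the viewer moves, an object never jumps abruptly from one rendering style to the other.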
Most of these techniques are well understood individually, but the particular combination of tricks employed to get enough speed is often highly application-specific. Putting together each frame, whilst managing other issues such as user navigation and application behaviour, is quite a challenge. At present, if you want to attempt this, you have two basic strategies available.
First, you could program the whole thing from the ground up in your favourite language, using a 3-D graphics library such as Mesa or OpenGL. This gives you a lot of flexibility: you can do anything that's possible, and at the best performance attainable. The drawback is that for a sophisticated task, this can be like programming an operating system in assembler. The level is too low, which makes it hard to re-use what you write for different kinds of applications.
Your second choice is to use a proprietary package for building virtual worlds. Such a package will get you going quickly, and hopefully has built-in high-level support for what you want to do, such as making your VE sharable between several users. To support these high-level features, the VE package has its own internal complexity and representations of the VE, which is, after all, what you are buying. The drawback here is that the complexity within the VE system may not be appropriate for what you want to do (i.e., a “one size fits all” approach may not work). If you have a complex application, it will have its own data structures and algorithms tuned to that application. In the case of the offshore platform, the CAD system that built it talks in terms of ladders, pipes, valves and so on, which are quite high-level descriptions. To use this data with the VE building package, it will have to be exported from the CAD system and imported in whatever format the VE system understands internally. The common denominator is, more often than not, a “sea of polygons”. So in exporting the data into the VE system's general-purpose format, much of the interesting information on what those polygons mean is lost, and with it the potential for exploiting that information to win speed.
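A small Python sketch makes the loss concrete. The `Pipe` class, its tessellation and the distance thresholds below are hypothetical, invented for this illustration; they stand in for whatever high-level description a real CAD system keeps and are not any actual CAD or VE file format.

```python
import math
from dataclasses import dataclass


@dataclass
class Pipe:
    """High-level description, as an application might hold it."""
    radius: float
    length: float

    def tessellate(self, segments: int):
        """Approximate the pipe wall with two triangles per segment around the axis."""
        triangles = []
        for i in range(segments):
            a0 = 2.0 * math.pi * i / segments
            a1 = 2.0 * math.pi * (i + 1) / segments
            p0 = (self.radius * math.cos(a0), self.radius * math.sin(a0), 0.0)
            p1 = (self.radius * math.cos(a1), self.radius * math.sin(a1), 0.0)
            p2 = (p1[0], p1[1], self.length)
            p3 = (p0[0], p0[1], self.length)
            triangles.append((p0, p1, p2))
            triangles.append((p0, p2, p3))
        return triangles


def segments_for_distance(distance: float) -> int:
    """Hypothetical level-of-detail rule: fewer segments for distant pipes."""
    if distance < 10.0:
        return 32
    if distance < 50.0:
        return 12
    return 4


pipe = Pipe(radius=0.1, length=5.0)

# A polygon-only export fixes the triangle count once, for every viewpoint.
exported = pipe.tessellate(32)

# An application that still knows "this is a pipe" can choose per frame.
near = pipe.tessellate(segments_for_distance(5.0))
far = pipe.tessellate(segments_for_distance(80.0))

print(len(exported), len(near), len(far))  # prints 64 64 8
```

Once only the 64 exported triangles remain, nothing in the data says they form a cylinder, so the cheap 8-triangle version for distant views is no longer available without reverse engineering the shape.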
A second problem with this approach is that if someone within the VE manipulates the objects (disconnecting a pipe or valve for example), then somehow that has to be communicated back to your CAD application which must then update its own data structures. So you get two representations of the world, one within the application, and one within the VE system, and these have to be kept synchronized.
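The bookkeeping this implies can be pictured with a toy example. Everything here is hypothetical: `CadModel`, `VeScene` and `synchronize` are names I have made up to show the shape of the problem, not part of any real CAD application or VE package.

```python
class CadModel:
    """The application's own representation of the rig."""

    def __init__(self):
        self.connected = {"valve_7": True}


class VeScene:
    """The VE system's separate copy, created when the data is imported."""

    def __init__(self, cad: CadModel):
        self.connected = dict(cad.connected)  # duplicate of the CAD state
        self.pending = []                     # edits not yet reported back

    def disconnect(self, part: str):
        # The user manipulates the object inside the VE...
        self.connected[part] = False
        # ...and the change must be remembered so the CAD side can be told.
        self.pending.append((part, False))


def synchronize(scene: VeScene, cad: CadModel):
    """Push edits made inside the VE back into the application's data."""
    for part, state in scene.pending:
        cad.connected[part] = state
    scene.pending.clear()


cad = CadModel()
scene = VeScene(cad)
scene.disconnect("valve_7")

out_of_step = cad.connected["valve_7"] != scene.connected["valve_7"]
synchronize(scene, cad)
back_in_step = cad.connected["valve_7"] == scene.connected["valve_7"]
print(out_of_step, back_in_step)  # prints True True
```

Between the edit and the call to `synchronize`, the two models disagree about the valve, and every kind of edit needs its own piece of glue code like this.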
So an “off the shelf” VE building package does well what it was designed to do. But as the desire for complex VEs increases, we encounter problems of programming complexity and the inherent performance limitations of maintaining two different models of the data, one of which is probably an unwieldy sea of polygons.