Controlling Creatures with Linux
The Jim Henson Company is well known for creating characters. Low-tech characters like the Muppets don't need much technology, but animatronics, from gerbils to dinosaurs, do need it, not to mention our 3-D computer graphic puppets. Performing live, in real time, so they can interact with human actors and be captured on film, these characters have a curious set of needs from a technology perspective.
One of Jim Henson's original performance goals was that one person should be in command of each character, bringing a spontaneity and personality harder to achieve in a “performance by committee” (where several people perform a puppet together). The fascinating thing about a creature that achieves this goal is that people forget who or what is controlling it and simply interact with it. Actors and audience alike start conversing with a dog or a frog or a snowman as if it were human.
With the proliferation of servo motor technology in animatronic puppets in the early 1980s, managing increasing numbers of servos became a challenge, so computerized control systems were designed. During the last 15 years, several generations of control systems have been developed at the Jim Henson Creature Shop, including a version that won a Technical Achievement Academy Award in 1992. The latest Henson Performance Control System (HPCS) encompasses the best features of previous systems, while adding new technology available only with today's hardware and computing environments such as Linux.
This system was begun under the guidance of Computer/Electronics Supervisor Jeff Forbes in early 1998. We had a vision that one system on a standard architecture could service all the company's needs. Steve Rosenbluth joined the project at that point as the control system designer, and Michael Babcock as the multimedia programmer. Our needs turned out to be rather expansive, and Linux seemed to be the only thing that could do it all without flinching.
The system has to support two back ends: one animatronic and the other computer graphic. So, our puppets are either real-world robots, or virtual puppets made of polygons and pixels. We can handle either separately or both at the same time.
Once the software “set up” of a puppet is in the system, even puppeteers new to the technology can jump in and perform well within hours. Using the input devices is akin to performing a musical instrument. At a certain point, the puppeteer, like the musician, no longer has to think about what he's doing—he just performs.
PCS setup for animatronic puppets, showing input controls and Control System laptop displaying the GUI.
Henson input devices are not motion capture technologies. Motion capture is both directly analogous to the performer and largely is nonprogrammable. In motion capture, a performer's arm simply corresponds to the creature arm, a knee corresponds to a knee, etc. The performer cannot enhance or reprogram these relationships. The Henson input scheme is both non-analogous and user-programmable. Our input devices are abstractions. For example, a puppeteer's index finger might proportionally control the sincerity or sarcasm of a creature's entire face. And a puppeteer can reprogram puppet movement easily between and even during performances. A person in a motion capture suit would be hard pressed to perform an octopus. A person operating our control system could take it in stride.
At the core of the system is a Control Computer, running RTLinux, which processes and distributes motion data to puppets. The process that runs motion-mixing algorithms on the Control Computer is called the Motion Engine. Steve Rosenbluth wrote it in C++. Performer movement, coming through input transducers from the outside world, passes through the Motion Engine on its way to networked puppets. Inside the Motion Engine, various algorithms are performed on the live data, resolving final actuator positions. An actuator is like a muscle of the puppet; it could be an electromechanical or hydraulic servo in an animatronic or a “virtual servo” (a mesh deformation) in a computer graphic puppet. Motion-mixing relationships are configurable in software to be one-to-many or many-to-one, and compound mixes can be performed on top of those. Physics, which add effects like weight or smoothing, can be added to performance data as it passes through the Motion Engine.
Motion-mixing algorithms are served up via software tools provided by a process called the Tool Server. The Tool Server establishes the relationships between live performance data and the algorithms the Motion Engine will perform on it, using direct access to objects and datasets in shared memory. Designed by Michael Babcock, it is, as the name implies, an asynchronous server that speaks with graphical user interface clients that connect to it.
The GUI is a “not-so-thin” client that connects to the Tool Server via a socket. Michael Babcock wrote the actual application in GTKmm, based on an original high-level interface design by consultant Robert McNally. The client/server architecture allows multiple instances of the user interface to be run either by technicians on other machines or by additional puppeteers.
There are times during production when a technician assists puppeteers with administrative tasks, so a networked GUI means the technician can do those tasks without kicking anyone off the system. A custom protocol is spoken between the GUI and Tool Server. The ability for a technician to administer our Control System application remotely is of great value to us, as production locations may be anywhere around the world, and support staff back at The Creature Shop can have complete access for the purposes of debugging or assisting.
Since Linux itself provides mechanisms for remote access to the rest of the operating system, a technician back at The Creature Shop can give us unparalleled support for systems on the road.
As you can see in the schematic diagram, the back end of the control system, beyond the Motion Engine, can vary. We can perform animatronics with embedded processors in a remote onboard computer (ROC) scenario, and we can perform 3-D Computer Graphic puppets with a Viewer back end.
In one 60Hz frame, the motion engine will obtain an analog to digital converter (ADC) snapshot of our physical input devices, execute motion-mixing algorithms based on the character setup and transmit the data to ROC clients.
Our animatronic ROCs are embedded PCs running DR DOS. Steve Rosenbluth wrote the ROC code in “somewhat object-oriented C” for speed, as it is a fairly lean-and-mean piece of code. The ROC protocol allows for multiple devices on the communications link, which is currently RS-232. This aging interface is not as high-bandwidth as modern networking hardware, but it is the easiest to use over fiber, copper and radio. RS-232 is a hard real-time interface with predictable triggering and latency, which we need. We use plastic fiber optics to transmit RS-232 when a creature is tethered and spread spectrum for wireless creatures.
The computer graphics Viewer is the software module that renders and displays a computer graphic (CG) puppet on a computer screen. The Viewer represented in the schematic diagram can be one of an array of CG modeling packages or game engines that can render quickly enough to display an OpenGL scene live. Computer graphics puppeteering at the Henson Company was pioneered by Digital Supervisor Hal Bertram in the London Creature Shop in the early 1990s. More recently, some of the PC-based CG applications for which Michael Babcock has written control system plugins include Discrete's 3-D Studio Max, Side Effects' Houdini, Kaydara's Filmbox and Alias|Wavefront's Maya.
A control system Actuator, in the CG realm, is scalar channel data that can move a 3-D mesh deformation or a blend shape. A UDP network connection from the Control Computer's Motion Engine streams live motion data into the Viewer. Viewers behave as ROC clients from the perspective of the Motion Engine, in that they speak the same ROC protocol.
With dual AMD Athlon machines, we get frame rates above 100FPS, enough to put multiple puppets in one scene. Character Technical Director Jeff Christie helped complete the picture by perfecting a 3-D character setup that gets fast, lifelike performance from our CG models.
The Viewer supports connections from multiple motion engines, which means, for a scene with multiple characters, that each can be performed by its own control system and puppeteer, sort of like a networked game.
Motion and sound can be recorded and played back by a nonlinear multimedia editor called the Recorder, developed by Michael Babcock. The architecture, designed by Michael and Steve, consists of a multithreaded process networked to the Motion Engine via UDP. The Recorder is synchronized to the Motion Engine because it slaves off its output as an ROC client, yet the Recorder also streams stored data back to the Motion Engine for broadcasting to other ROC clients. This networked structure allows each process to have its own timing and I/O requirements, without interfering with the other, as in the Tool Server/Motion Engine relationship.
Because recorded motion can be cued and played back live, the puppeteer can layer a performance, as one would produce a multitrack audio recording. This is particularly useful for lip-sync scenarios, where the performance of a creature's mouth can be perfected off-line, then played back while the puppeteer performs the rest of the character live.
Dan Helfman contributed a sound recording facility to the SDL, the open-source multimedia API we use in the Recorder.
A module within the Motion Engine called the Link Supervisor can broadcast and manage connections with multiple ROC clients, regardless of their network type or implementation. The result is that one puppeteer can control multiple puppets in multiple mediums. For example, an animatronic cat can be performed at the same time as its computer graphics counterpart. While the body and face of the animatronic is captured on camera, a computer graphics mouth, performed simultaneously, can be viewed live on a monitor or even composited live with the camera tap image on set.
This allows each medium to do what it does best. We get the complex lighting and physics of a “real” creature on set, and CG mouth data can be further finessed in postproduction before compositing with the film plate. This live previsualization allows a director to direct truly the creature performance on set, while allowing actors to interact with their creature costars.
There is a purposeful division between the Motion Engine, the Tool Server, the GUI and the Recorder. Because the more complex multimedia and networking modules require software techniques that might compromise process timing or stability, an architecture was designed by Steve Rosenbluth and Tim McGill that builds a wall around the Motion Engine. The goal was for the Motion Engine to have a minimal amount of complexity so that it keeps running. The Tool Server, expected to grow large and complex, was allowed to go down and restart without affecting the Motion Engine. The architecture also allowed the GUI to come and go without negatively affecting either the Tool Server or Motion Engine, and likewise for the Recorder. To accomplish this, the system was segmented into process modules that communicate via UNIX IPC and networking.
The Tool Server and Motion Engine have a block of System V shared memory in common. This enables immediate updates of critical data objects. They also communicate via two FIFOs for messaging that is sequence-critical. There also are UDP network sockets between the Motion Engine and Recorder, which stream data in soft real time to each other. The Motion Engine is what we call a near-mission-critical application, in that its failure in the field could have negative consequences for us. On-set downtime can cost a film production company many thousands of dollars an hour. It's also the nature of the motion picture industry that actors and crew may be in close contact with the animatronic machinery. It would be a bad thing to have an animatronic dog bite an actor while a technician logs in and restarts applications. That is why there is no GUI or other unnecessary code in the Motion Engine. Given our near-mission-critical requirements, the stability of the Linux operating system itself is a big plus.
The independent process architecture also aided development by allowing individual programmers to write and test more modular, self-contained pieces of code. It gave developers the freedom to use custom, and sometimes cutting-edge, programming techniques safely that weren't necessary or appropriate for the other process modules.
The timing requirements of the system are varied. The Motion Engine has to have an accurate 60Hz invocation frequency in order to update animatronic motors smoothly. Time-domain jitter in motor position data sent to a puppet adds high-frequency accelerations to that data and could cause a puppet limb to jitter. Our ultimate goal was to have a precise 60Hz Motion Engine frequency, but we planned to arrive there in three architectural iterations, as the system came on-line.
In order to prototype the system, periodic software interrupts were used to invoke the Motion Engine as a handler. Additionally, POSIX.1b SCHED_FIFO prioritization was used to make sure that once the Motion Engine was invoked, it didn't get preempted by the kernel scheduler. This allowed the Motion Engine to run in user space easily, and most importantly, in a debugger. The downsides of alarm handlers are twofold: 1) they can have jitter the magnitude of a timeslice or more, caused by a busy kernel scheduler, and 2) one can't specify their period very accurately, since it is in quantums of kernel timer ticks. We recompiled our kernels to increase the number of timer ticks per second for two reasons: to round the Motion Engine period closer to 1/60Hz and to help insure that lower-priority control system processes got descheduled more frequently, helping them all keep current with the Motion Engine state.
For the second architectural iteration, we created an RTLinux hard real-time periodic thread at a frequency of 60Hz. This thread, because it runs in RTLinux space, is about three orders of magnitude more precise than the kernel scheduler. We refer to this thread as our Hard Real Time Pacer. When it wakes up, it puts a flag into an RTLinux FIFO from RTLinux space, and our Motion Engine, blocking on this FIFO in user space, wakes up when the flag arrives. Although the Motion Engine still relies on the the Linux kernel for invocation, this architecture proved to be more accurate than we anticipated, as I/O latency is of a higher priority in the kernel than signal handling. Typical latencies of the FIFO-blocker pacer are less than 40 microseconds when there is no other heavy system activity, which means that the Motion Engine does have a true 60Hz invocation frequency, as accurate as the CPU timers can provide.
Under heavy load, the kernel may not unblock the Motion Engine on time, so this is not a deterministic hard real-time solution, but it served us well as we started using the system for production work. A dual-processor Athlon motherboard can maintain Motion Engine invocation accuracy while running the GUI, Tool Server and the Viewer, rendering OpenGL scenes in a busy loop!
For the third and final architecture we contracted FSMLabs to add an extension to RTLinux that allows deterministic scheduling of Linux user-space processes. The mechanism, called PSC, allows a sort of “jump” from a RTLinux periodic thread to user space, where we run our Motion Engine, then we fall back to RTLinux and finish. Part of our contract was that the code be donated back to open-source RTLinux for all to use.
The input devices used in the control system are a special combination of linear input potentiometers designed for both ergonomics and flexible use. They are uniquely suited for producing the types of motion needed by a puppeteer. We started with the rather aggressive design goal of running the whole control system on a Linux laptop, for maximum portability “on location”. Only when servicing CGI did we start creating 19" rackmount Control Systems.
One of the challenges of laptops was getting 64 channels of analog data into the machine. No A/D converter (ADC) drivers were available, so Steve Rosenbluth wrote one for the Computer Boards DAS16s/16. To date, no PCMCIA ADC card provides more than 16 channels, so Steve also designed an external analog multiplexer. In order to switch the multiplexer through four banks within one motion control frame (16.6msec) we relied on RTLinux, which gave us the determinacy we needed with sub-20-microsecond accuracy.
While writing low-level ADC drivers can be rewarding, it isn't the best way to spend our available R&D resources. We were elated to find during 2000 that United Electronics Industries offered both Linux and RTLinux drivers for its Powerdaq ADC cards, which we used in our PCI bus systems. Their 64-channel PD2-MF-64-333/16L worked like a charm, and UEI was responsive to our needs as the driver developed.
Although part of our design philosophy is to use as much off-the-shelf hardware as possible, we do have to design hardware for our specialized needs. Most manufacturers' idea of “small” simply doesn't come close to fitting inside an animatronic hamster.
The Digital Motor Controller (DMC), a specialized piece of hardware about the size of two postage stamps, is essential for animatronics. It distributes dozens of PWM (pulse width modulation) signals to motors inside the puppet. Steve, with the help of Glenn Muravsky, designed an SBC that was based on the Texas Instruments 320LF2406 motor controller DSP. The parallelism available on this chip allows it to do things one cannot do on a PC bus. Steve also implemented a closed loop PID algorithm on the chip for controlling custom motor servos.
Linux wasn't originally designed for hard real time at the kernel level, but with the advent of real-time extensions, we found we were able to prioritize our critical tasks while having the general-purpose operating system available.
Preemptive multitasking, memory protection, interprocess communication, networking and multimedia APIs were essential for everything else we wanted to implement that wasn't hard real-time. We found that the RTLinux architecture, where Linux itself runs as the lowest-priority task, gives us the option to add a lot of support applications on the same machine that runs real-time tasks.
The challenge of this project was not any one specific area but orchestrating the many different requirements at once without compromising the needs of any individual piece of the puzzle. Our soft real-time code needed a chance to run; our motion control code needed to run without interruption from lower-priority tasks; we needed to protect critical code segments from less important code segments; we needed to use generic kernel facilities; and we needed to use utility applications as part of our distribution. The facilities of the core operating system and the utilities around it sped our development, as we did not have to roll our own version of everything. But if we ever did run into a jam, we knew we had the source code and thus could be in control of our own destiny. So, ultimately, Linux was the only thing that came to mind that could meet all of our needs.
Incidentally, we've been running these new control systems 24/7 for three years now, and our uptimes average a few months. We frequently have reasons to move machines from one location to another, but we've never had a control system go down in production.
Where can you see the products of our new control system? We cut our teeth in 1999 with “Webisodes” featuring the Muppets on the Henson web site, www.muppetworld.com. We've done a number of live performances at tradeshows, amongst them an interactive CG Kermit the Frog for the keynote speech of SIGGRAPH 2000.
The maiden voyage of the new control system in motion pictures was in 2001 on Walt Disney's Snowdogs. The lead Husky in that film, named Demon, had an animatronic double for difficult shots—try to spot them. As we had hoped, the system performed flawlessly. We also performed an animatronic falcon for Stuart Little 2 during 2001. Horace D'Fly is a CG character who appears in an upcoming Henson home video feature Kermit's Swamp Years. We currently are planning to use the system for a sequel to Warner Brothers' Cats and Dogs, plus other feature productions involving animatronics. There is a lot of entertainment industry interest in using our CG back end to produce characters for motion pictures, television shows and video games in the near future. So, we'll see you at the movies!
Steve Rosenbluth is 34 years old, and is married to a wife and a cat. He hails from New York, DC, Florida, Montréal, Belfast and LA. He has been designing robotic creatures and related technology for 13 years. He spent his formative years learning stop-motion animation, sculpture, electronics, programming and small-business development. When not on computers, Steve enjoys biking, hiking and skating but regrets no longer performing guitar or trapeze.
Michael Babcock (email@example.com) grew up in Montana in the mountains among trees and deer. He enjoys playing basketball, hacky-sack and the guitar and listening to The Fall. He has been using Linux since 1992. His programming interests include multi-lingual software (especially Chinese and Japanese) and abstract software design techniques. He has been working at Jim Henson's Creature Shop in Los Angeles for four years on a puppeteering control system, mostly doing network, user interface and 3-D graphics programming.
David Barrington Holt (firstname.lastname@example.org) (aka DBH) is the creative supervisor of Jim Henson's Creature Shop in Los Angeles. Born and educated in London, England, he graduated with honours in Industrial Design at the height of the “swinging Sixties”, but his subsequent career has included fashion design, graphic art, photography and engineering model making. He has also trained as a psychotherapist, entertained “too many interests for his own good”, has a Russian wife from St. Petersburg and an 11-year-old son.