GNU Bayonne Is for Telephony

The Bayonne Project makes running telephony software on any vendor's card possible.

Three years ago I came to realize that we had a serious need in free software. Although free software had expanded to fill almost every other void in the enterprise infrastructure, we had not addressed the needs of telecommunications. Telecommunications are not only a part of the infrastructure of every business, but they are also an often overlooked part of the desktop user's experience. At the same time, the hardware required to create telephony services for the public telephone network has become more widely available under commodity PC platforms and operating systems, including GNU/Linux.

In choosing to address telecommunications with free software, I and a few others decided to create a framework describing what all these services might be, ranging from the needs of desktop users and application programmers to the needs of the largest commercial carriers. This project later became known as GNUCOMM when it was officially folded into a GNU project working group.

One area we chose to define was the idea of a telephony application server. Such a server should make it both possible and easy to create and deploy new telephony application services. These would be applications specifically written to interact with real people that call the server over regular telephone lines and interact with the application with both a voice and a telephone keypad.

Applications of this nature typically include things like voice-mail systems or prepaid (debit card) calling platforms. All of these systems are complex and sometimes programmable systems and specialized computer telephony hardware are needed to provide an interface between the PC platform and the public telephone network. This can be hardware that talks to individual analog telephone lines or even hardware that provides multiport voice control over ISDN and T1 digital voice circuits, which larger enterprises can get directly from a local carrier's central office.

With full consideration that such systems in the past were generally very expensive, always proprietary and often hard to program, I chose to solve all of these problems at once by writing a server under the best supported free software platform available at the time: GNU/Linux.

When we started the project, few companies provided telephony hardware under GNU/Linux, so we used what was available. Even now, each telephony card is different from every other one and tends to include its own API. Since neither the hardware nor the APIs are in any manner standardized, most people that produce telephony applications do so for only a single vendor's card family, and they do so using exclusively the vendor's supplied API. This practice also means that any vendor in the computer telephony business has to provide a very broad family of hardware because one could not substitute easily other products to fit gaps in a product offering. All these things have made it difficult for new telephony card vendors to come into existence and easy for the limited vendors that do exist to maintain their markets without much change.

This is not to say that no efforts were made to standardize APIs. After all, there is the ECTF (European Community Telework/Telematics Forum). Being an industry consortium of proprietary vendors, they would have to come up with, through committees, a complicated set of standards and proposals for how proprietary vendors could develop and maintain computer telephony solutions. Furthermore, they would need to do so in ways that expand the need for specialized knowledge, increasing the stranglehold of their existing members on the computer telephony marketplace.

Another popular organization is the ITU (International Telecommunications Union), best known for the fact that appointment often is handled by national governments. In the US, for example, this is done as a political appointment by the state department, rather than from among the best and brightest minds.

Our goal was not only to produce a telephony server as free software, but we wanted also to make telephony application services as readily and easily approachable as creating and administering a web site. We also wanted to abstract the telephony driver and APIs to the point that they were both irrelevant and invisible in the development of application services. Doing so would mean anyone could substitute hardware as they wished, rather than being locked into the offering of a single vendor.

First Came ACS

Since we wanted to abstract everything within the server at a low level, the first thing we needed was a portable class foundation written in C++. I wanted to use C++ for several reasons. First, it seemed natural to use class encapsulation for the driver interfaces because of their abstract nature. Second, I found I could write bug-free C++ code faster than I could write C code. In fact, this would become my first large-scale C++ project.

Why we chose not to use an existing framework is also simple to explain. We knew we needed threading, socket support and a few other elements. No single existing framework did all these things except a few that were larger and more complex than we needed. For example, we wanted a small footprint for a telephony server. The most adaptable framework at the time was ACE (Adaptive Communication Environment), which typically added several MBs of core image for the runtime library. Since we were looking at running on machines with as little as 8-12MBs of memory, this seemed an unacceptable overhead.

GNU Common C++ (originally APE) was created to provide an easy-to-comprehend and portable class abstraction for threads, sockets, semaphores, exceptions and so on. APE has since grown and is now used as a foundation for a number of projects in addition to being a part of GNU.

As to creating services themselves, we realized we needed a new way to create telephony applications—one that would make the process approachable for the average system administrator. For simplicity we choose to use a common scripting language, which later became known as GNU ccScript. By writing scripts and recording audio samples to create telephony application services, virtually anyone could participate without needing specialized knowledge or deep understanding of fantastically complex APIs like those promoted by the ECTF. Because the underlying telephony hardware is both invisible and abstracted away from the application scripting language, the cycle of dependence on using a single card family is also broken.

But what form should this new scripting language take? Many extension languages assume a separate execution instance (thread or process) for each interpreter instance, making them unsuitable for our project. Many extension languages assume expression parsing with nondeterministic runtime. An expression could invoke recursive functions or entire subprograms, for example. Again, we did not want to have a separate execution instance for each interpreter instance, and we did not want to have each instance respond to the leading edge of an event callback from the telephony driver as it steps through a state machine, so none of the existing common solutions like Tcl, Perl, Guile, etc., would immediately work for us. Instead, we created an entirely new nonblocking and deterministic scripting engine for our first server.

Our scripting language is unique in several ways. First of all, it is step executed and nonblocking. Statements can either execute and return immediately or schedule their completion for a later time with the executive. This allows a single thread to invoke and manage multiple interpreter instances. While a telephony server can potentially support interactions with hundreds of simultaneous telephone callers on high-density carrier scale hardware, we do not require hundreds of native “thread” instances running in the server, and we have a very modest CPU load.

Another way our scripting is unique is in support for memory-loaded scripts. To avoid delay or blocking while loading scripts, all scripts are loaded and parsed into a virtual machine (VM) structure in memory. When we wish to change scripts, a brand new VM instance is created to contain these scripts. Calls currently in progress continue under the old VM, and new callers are offered the new VM. When the last old call terminates, then the entire old VM is disposed of. This allows for 100% uptime, even while services are modified.

Finally, since we were building a C++ scripting system, we allowed direct class extensions of the script interpreter as a means to add new script functionality. This allows one to create a derived dialect specific to a given application or, if needed, specific to a given telephony driver, simply by deriving it from the core language through standard C++ class extension.

While the server scripting language can support the creation of complete telephony applications, it was not designed to be a general-purpose programming language or to integrate with external libraries the way traditional languages do. Nonblocking requires that any module extensions created for the server be highly customized. Instead, we wanted a general-purpose way to create script extensions that could interact with databases or other system resources. To that end we chose a model essentially similar to how a web server did this when our ACS (Adjunct Communication Server) Project was created.

The TGI model for our server is similar to how CGI works for a web server. In TGI, a separate process is started, then is passed information on the phone caller through environment variables. Environment variables rather than command-line arguments are used to prevent snooping of transactions that might include things such as credit-card information that could be visible with a simple ps command.

The TGI process is tethered to the server through stdout and any output the TGI application generates is used to invoke server commands. These commands can do things like set return values, such as the result of a database lookup, or they can do things like invoke new sessions to perform outbound dialing. Rather than creating a gateway for each concurrent call session, a pool of available processes are maintained for TGI gateways so it can be treated as a restricted resource. It is assumed that gateway execution time represents a small percentage of the total call time, so maintaining a small process pool always available for quick TGI startup is efficient. This helps to prevent stampeding if, say, all the callers hit a TGI at the same moment.

With these basic tools, it was possible to create interactive voice response applications. As soon as it was functional, our first telephony server was used commercially by Open Source Telecom and other companies. This wide adoption was a result in part of how simple it is to create new application services and to integrate telephony applications under this server with other aspects of a commercial enterprise. As noted, the only requirements are some skill in constructing a server-side script, the ability to play and record audio and some knowledge of common tools like Perl.

A typical application for our server might look like the one shown in Listing 1 [available at], the playrec script. This script demonstrates the different concepts in the current scripting language, including symbol scope and event trapping, which, used under named script references, form a chain of logic for processing an interactive telephony application. In Listing 2 [available at], we have an example of the server's use of Perl with the module and the Perl script.