VOCAL: Open Source VoIP Software for Linux
While most open-source projects are applications and utilities intended for single users, we did something different. We created an infrastructure project: a Voice-over-IP (VoIP) phone system that either can run on a single box attached to a couple of IP phones or can scale up to a network of hosts processing hundreds of calls between thousands of users.
Our project is known as the Vovida Open Communications Applications Library (VOCAL), a fully functional phone system that can run on either Red Hat Linux or Sun Solaris. Our code has a BSD-style license, and you can download it directly from our web site (http://www.vovida.org/), or check out the latest code from our CVS server.
In this article, we discuss how the state machine for our user interface has been implemented both from a state machine/operator view and a source-code view. We're hoping that by reading this article, you will be encouraged to log on to our site and check out our stuff, literally.
VOCAL is based on an Internet Engineering Task Force (IETF) communication standard called Session Initiation Protocol (SIP, http://www.ietf.org/rfc/rfc2543.txt). The SIP standard describes signaling methods and a series of network components including a user agent (UA), which is the primary user interface offered by VOCAL. The UA can be a special piece of hardware, like an SIP IP Phone, an adapter that translates between analog phone signaling and SIP, or software (a "softphone") that runs on a PC or another type of host. In the examples below, we are assuming that the user is making calls through the VOCAL SIP UA.
The VOCAL SIP UA is a softphone that was built on top of an SIP Proxy code base, which gives it a framework for its state machine, threads and call-processing data structure. The user presses A to activate the UA and Z to deactivate it. Figure 1 is a state/operator diagram that illustrates the interaction that occurs between states, events and operators when the user makes a call.
Let's walk through the state/operator diagram. At the top, the UA is On-hook (inactive) and in the Idle state. When someone wants to make a call, he or she presses A to activate the UA, taking it off-hook. This makes the device generate an off-hook event, which is entered into a FIFO database within the UA. The OpStartCall operator retrieves this event, processes it and moves the UA to the next state, called Dialing.
As the user enters a phone number, multiple dual-tone multi-frequency (DTMF) digit events arrive, each calling the OpAddDigit operator (not shown in the diagram). When the user has finished dialing, a Dialing Complete event occurs, calling the OpInviteUrl operator to send an SIP INVITE message out to the VOCAL system for routing and forwarding. Sending this message moves the UA to the Trying state.
After VOCAL has routed the call, the other end will start ringing and eventually the called user will answer. (In this example, the called party is available and willing to answer the call.) Answering the call makes the far-end UA send an SIP 200, OK message back to the caller, indicating that the call session has been established. The OpFarEndAnswered operator processes this message and takes the UA to the InCall state.
At this point, the call is in progress until one of the users hangs up, thus generating an On-hook event, which the OpTerminateCall operator processes. This operator sends an SIP BYE message to the far end, which ends the call and takes the UA back to the Idle state where we began. Having looked at a complete call flow, let's shift our attention to the code that makes these operators tick.
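The walkthrough above can be condensed into a small transition table. The following is a minimal sketch rather than VOCAL code: the enum and function names are invented for illustration, and the real UA spreads this logic across its State and operator classes. The comments name the operator from the walkthrough that performs each transition.

```cpp
#include <cassert>
#include <initializer_list>

// Illustrative names only; these enums do not appear in the VOCAL sources.
enum class UaState { Idle, Dialing, Trying, InCall };
enum class UaEvent { OffHook, Digit, DialingComplete, FarEndAnswered, OnHook };

// Returns the state that follows `state` when `event` arrives. Events that
// are not relevant in the current state leave the state unchanged.
UaState nextState(UaState state, UaEvent event) {
    switch (state) {
    case UaState::Idle:
        if (event == UaEvent::OffHook) return UaState::Dialing;         // OpStartCall
        break;
    case UaState::Dialing:
        if (event == UaEvent::Digit) return UaState::Dialing;           // OpAddDigit
        if (event == UaEvent::DialingComplete) return UaState::Trying;  // OpInviteUrl
        break;
    case UaState::Trying:
        if (event == UaEvent::FarEndAnswered) return UaState::InCall;   // OpFarEndAnswered
        break;
    case UaState::InCall:
        if (event == UaEvent::OnHook) return UaState::Idle;             // OpTerminateCall
        break;
    }
    return state;
}

// Feeds a sequence of events through the table and returns the final state.
UaState replay(UaState start, std::initializer_list<UaEvent> events) {
    UaState state = start;
    for (UaEvent e : events) state = nextState(state, e);
    return state;
}

// Usage: the complete call from the walkthrough ends where it began.
void checkWalkthrough() {
    const UaState end = replay(UaState::Idle,
                               {UaEvent::OffHook, UaEvent::Digit, UaEvent::Digit,
                                UaEvent::DialingComplete, UaEvent::FarEndAnswered,
                                UaEvent::OnHook});
    assert(end == UaState::Idle);
    (void)end;  // keeps builds with NDEBUG free of unused-variable warnings
}
```

This sketch covers only the successful call described here. Hanging up while dialing, or a call that is never answered, would need further entries.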
Looking at our CVS tree under vocal_all/sip/ua, the states and operators are implemented as a series of C++ files. Each state inherits from the base State class (vocal_all/sip/base/State.cxx), and our developers have written operators to enable transitions to new states. All registered operators are called when an event (such as activating or deactivating the UA, receiving an SIP message, etc.) occurs to evaluate whether the event is relevant to their purpose in life. If the event is not relevant, the operator ignores the event. Otherwise, it springs into action.
Listing 1 shows some of the code for the OpStartCall operator (vocal_all/sip/ua/OpStartCall.cxx), which is called when the UA is activated and before the first digit of the phone number has been entered. The important function for an operator is the process() function, the function that is called when an event arrives.
The code shown in Listing 1 looks at the type of event that has been received from the hardware. If it isn't a DeviceEventHookUp (someone activating the UA), it is ignored. Perhaps some other operator will respond, but this isn't an event that the OpStartCall operator is meant to handle, so it does nothing and suggests no alternative state.
If someone did just activate the UA, OpStartCall performs the processing needed for this transition. The digitCollector is reset to begin collecting the digits as the UA buttons are pressed, and after ensuring that this state machine really exists (can't be too careful, you know), OpStartCall looks up the state for Dialing and returns that to the uaBuilder, which moves the state machine to Dialing. In this scenario, OpStartCall should be the only operator to respond to the DeviceEventHookUp event. If multiple operators were to respond and suggest different states, we would consider that to be an error.
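The pattern described here can be sketched in a few lines of C++: every operator is offered the event, an operator that does not care returns nothing, and two operators suggesting different states is treated as an error. This is a simplified illustration under our own assumptions, not the code in vocal_all/sip/ua. The types EventType, State, DigitCollector, Operator and StartCallOperator, and the dispatch() function, are hypothetical stand-ins.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real VOCAL classes.
enum class EventType { HookUp, HookDown, Digit };

struct State {
    std::string name;
};

struct DigitCollector {
    std::string digits;
    void reset() { digits.clear(); }
};

// An operator inspects an event and either ignores it (returns nullptr)
// or handles it and suggests the next state.
struct Operator {
    virtual ~Operator() = default;
    virtual const State* process(EventType event) = 0;
};

struct StartCallOperator : Operator {
    DigitCollector* collector;
    const State* dialing;

    StartCallOperator(DigitCollector* c, const State* d) : collector(c), dialing(d) {}

    const State* process(EventType event) override {
        if (event != EventType::HookUp) return nullptr;  // not our event: ignore it
        assert(collector != nullptr && dialing != nullptr);  // can't be too careful
        collector->reset();                                  // start with no digits
        return dialing;                                      // suggest Dialing
    }
};

// Offers the event to every registered operator. Returns the suggested state,
// or nullptr if no operator responded. Throws if two operators disagree.
const State* dispatch(const std::vector<Operator*>& ops, EventType event) {
    const State* next = nullptr;
    for (Operator* op : ops) {
        const State* suggestion = op->process(event);
        if (suggestion == nullptr) continue;
        if (next != nullptr && next != suggestion) {
            throw std::logic_error("operators suggested different states");
        }
        next = suggestion;
    }
    return next;
}
```

Returning a null pointer for "not relevant" keeps each operator independent of the others. The dispatcher alone decides what to do when nobody responds or when the suggestions conflict.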
In the Dialing state, digits are collected until the user has entered a complete phone number. We don't look at how the UA knows that the dialing has been completed, but we will look at what happens afterward. The UA is ready to send out an SIP INVITE message to try placing the call. Listing 2 [available at ftp://ftp.linuxjournal.com/pub/elj/listings/issue09/5480.tgz] shows the process method of the code for operator OpInviteUrl. This is the operation that moves the UA from the Dialing state to the Trying state, where it tries to establish a call with the UA on the far end.
The message generated by the code shown in Listing 2 could look like the following. Again, some SIP headers and the SDP attachment have been removed for clarity:
INVITE sip:email@example.com:5060;user=phone SIP/2.0
[126.96.36.199:5060->188.8.131.52:5060]
From: VOVIDAroast1<sip:184.108.40.206:5060>
To: 8040000000<sip:firstname.lastname@example.org:5060; user=phone>
Call-ID: email@example.com
Contact: <sip:220.127.116.11:5060>
Essentially, OpInviteUrl is the operator that initiates the call by sending the SIP INVITE message. First it checks the phone number that has been dialed to make sure that it is a valid number (digitCollector returns false if dialing is not complete). If the number is valid, the operator decides which proxy server to use and how to route the call properly to that server.
The lines that follow the server address resolution set a number of fields in the SIP message. A Call-ID is generated, which is a unique value used to identify this particular call. The specification of how to generate this value, as well as details about the format of all the SIP headers, is available in RFC 2543.
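As a rough illustration of Call-ID generation, RFC 2543 gives the header value the general shape local-id@host. The sketch below builds such a value from a random number. The function name makeCallId, the hexadecimal local-id and the choice of generator are our assumptions for this example, and it is not how the VOCAL stack does it. A production UA would want a stronger source of randomness than std::mt19937_64.

```cpp
#include <cassert>
#include <random>
#include <sstream>
#include <string>

// Builds a Call-ID of the form local-id@host. The local-id here is one
// 64-bit value from the supplied generator, written in hexadecimal.
std::string makeCallId(const std::string& host, std::mt19937_64& rng) {
    assert(!host.empty());  // a Call-ID without a host part would be ambiguous
    std::ostringstream out;
    out << std::hex << rng() << '@' << host;
    return out.str();
}
```

Passing the generator in, rather than creating it inside the function, lets the caller seed it once at startup and keeps the function easy to test.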
After generating the Call-ID, the Request Line is created. This is the top line of the message, essentially the Uniform Resource Locator (URL). This specifies whom the UA is trying to reach, including name (8040000000 in this example), host (18.104.22.168) and port. It also specifies that the other endpoint is a phone, rather than another type of device.
Two headers are generated to permit return signaling: the From field, indicating where this message originated, and the Contact field, which could be different from the From field. Ideally, the user will respond to Contact, but the message might originate from a different source. Finally, after setting these fields in the message, the UA references the SIP stack using the getSipStack() method and sends the message out over the wire to its destination.
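To make the header discussion concrete, here is a minimal sketch that assembles the same handful of fields into message text. It is not the VOCAL SIP stack, which has its own message classes. The InviteFields struct, the buildInvite() function and the example.com host names used with it are hypothetical. Like the example message shown earlier, it leaves out other headers and the SDP attachment, so the result is not a complete INVITE.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical container for the fields discussed in the text.
struct InviteFields {
    std::string user;       // dialed number, for example "8040000000"
    std::string proxyHost;  // server that will route the call
    int proxyPort;
    std::string localHost;  // address of this UA
    int localPort;
    std::string callId;     // unique identifier for this call
};

// Assembles the Request Line plus the From, To, Call-ID and Contact headers.
std::string buildInvite(const InviteFields& f) {
    assert(!f.user.empty() && !f.callId.empty());

    // Whom we are trying to reach: name, host, port, and "this is a phone".
    const std::string target = "sip:" + f.user + "@" + f.proxyHost + ":" +
                               std::to_string(f.proxyPort) + ";user=phone";
    // Where return signaling should go. From and Contact could differ;
    // this sketch uses the same address for both.
    const std::string self = "sip:" + f.localHost + ":" + std::to_string(f.localPort);

    std::ostringstream msg;
    msg << "INVITE " << target << " SIP/2.0\r\n"
        << "From: <" << self << ">\r\n"
        << "To: " << f.user << "<" << target << ">\r\n"
        << "Call-ID: " << f.callId << "\r\n"
        << "Contact: <" << self << ">\r\n"
        << "\r\n";
    return msg.str();
}
```

Each header line ends with a carriage return and line feed, and an empty line marks the end of the headers.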
We've only given you a little taste of the base code--the state machine and SIP message/stack that is available--but there is much more. We urge you to go out and learn more about this exciting technology; visit us at http://www.vovida.org/, sign up for the mailing lists and download VOCAL. Check it out; we've written this for you.
David Bryan (firstname.lastname@example.org), formerly a senior software engineer with Vovida Networks and Cisco Systems, Inc., is now vice president of engineering at Jasomi Networks, Inc., based in San Jose, California. David has ten years of software development experience in both industry and academia. David Kelly (email@example.com) is a senior technical writer for Cisco Systems, Inc. David has seven years of technical writing experience and was a staff writer for Vovida Networks before it was acquired by Cisco in 2000.