The OpenPhone Project—Internet Telephony for Everyone!
The standard PSTN digitization technique is described by the ITU G.711 specification. This document describes an 8KHz sampling rate using 8 bits per sample, for an effective 64Kbps data rate. There are two subsets of this technique: A-Law and Mu-Law. These are scaling factors that take into account the sensitivity of the human ear and allow for a more efficient utilization of the 8-bit encoding space for recording the human voice. Both are in wide use and are considered standard codecs.
However, 64Kbps is excessive for Internet telephony. Two-way audio encoded in this manner yields 128Kbps of audio alone (not counting signaling or overhead), making it impractical for use across a dial-up link, and wasteful of bandwidth even on a LAN or WAN link. Voice compression is the answer, and there are several options from which to choose.
Audio codecs can either be implemented in software or provided by the telephony interface card using an on-board DSP. On a lightly loaded PC, the extra processing load incurred by the encoding is not particularly burdensome, although it can be on more heavily utilized machines. Software-only solutions can add some latency, since the encoding usually happens in a user-space program and is thus at the whim of system load and the scheduler. Internet telephony cards use on-board DSPs to perform the encoding, virtually eliminating the load on the host CPU and the associated latencies.
Perhaps the most widely used codec in Internet telephony is G.723.1. This codec provides either a 6.3 or 5.3Kbps data stream of packets that hold 30ms of audio each. The encoding requires 7.5 milliseconds of lookahead, which, when combined with the 30-millisecond frame, totals 37.5ms of coding delay. This is the minimum theoretical latency using this codec. Of course, real-world latencies will include the time on the Internet and application-level processing at both ends (and the jitter buffer—mustn't forget that).
G.723.1 is patented technology. The patent holders charge royalties for its use, making it impossible to use in open-source software. However, most telephony boards (including the hardware used by the OpenPhone Project) include this codec as part of the board price. Aside from other technical advantages of telephony interface boards over sound cards, the inclusion of licensed compression codecs on the card is perhaps the single best reason we need these cards. With the codec in hardware, we can write open-source software and not violate any licenses. I'd like to point out that the G.723.1 codec is very different from the G.723 codec (a much higher bandwidth codec in which the source code is widely available, but not often used due to high bandwidth requirements).
Rapidly gaining in popularity are the G.729 and G.729A codecs. These codecs use an 8KHz sampling rate, a 10ms audio frame and a 5ms lookahead buffer, for a total of 15ms processing latency. Because the frames are only 10ms long, this codec suffers less from packet loss—the software has less to recover, given the loss of any one packet. G.729A is a version of the codec that uses fewer DSP resources, making it easier to implement on lower-performance (cheaper) DSP chips. The audio quality these codecs can deliver is astounding. I recently experienced a LAN-based call using the G.729A codec and stood next to the person calling me. I could detect no perceptible delay between the direct path (his mouth to my ear) and the link across the network. If I had not known otherwise, I would have sworn it was a normal PSTN call. This codec is the future of Internet Telephony.
In-band signaling includes all the tones you normally hear in the course of a call, including ringing, busy-signal, fast-busy and the all-important dual-tone modulation frequency digits (DTMF). These tones are a normal expected part of using a telephone, but unfortunately, many do not compress well with the algorithms discussed above. One alternative, the one chosen by the OpenPhone project, is to pass these signals as a separate data type within the real-time audio stream. This has the advantage of ensuring that the signals are reproduced accurately at the other end of the connection regardless of the audio codec needed, and it reduces the bandwidth needed to convey the data.
This technique is described in an Internet Draft (see Resources). It basically specifies a new RTP data payload type; the application layer simply sends the data in the stream along with the normal audio. One very tricky aspect needing more work is the code required to remove the actual audio signals from the audio data stream prior to compression. Should these tones and signals get compressed, the codecs will distort these signals at the far end, reducing the chances of properly detecting the signal or tone. It's far better to filter these signals out prior to compression and to pass the signal across the network directly, allowing the other end to reproduce the signal with perfect accuracy.
Fast/Flexible Linux OS Recovery
On Demand Now
In this live one-hour webinar, learn how to enhance your existing backup strategies for complete disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible full-system recovery solution for UNIX and Linux systems.
Join Linux Journal's Shawn Powers and David Huffman, President/CEO, Storix, Inc.
Free to Linux Journal readers.Register Now!
- The Italian Army Switches to LibreOffice
- Download "Linux Management with Red Hat Satellite: Measuring Business Impact and ROI"
- Linux Mint 18
- Oracle vs. Google: Round 2
- The FBI and the Mozilla Foundation Lock Horns over Known Security Hole
- Varnish Software's Varnish Massive Storage Engine
- Devuan Beta Release
- Privacy and the New Math
- Ben Rady's Serverless Single Page Apps (The Pragmatic Programmers)
Until recently, IBM’s Power Platform was looked upon as being the system that hosted IBM’s flavor of UNIX and proprietary operating system called IBM i. These servers often are found in medium-size businesses running ERP, CRM and financials for on-premise customers. By enabling the Power platform to run the Linux OS, IBM now has positioned Power to be the platform of choice for those already running Linux that are facing scalability issues, especially customers looking at analytics, big data or cloud computing.
￼Running Linux on IBM’s Power hardware offers some obvious benefits, including improved processing speed and memory bandwidth, inherent security, and simpler deployment and management. But if you look beyond the impressive architecture, you’ll also find an open ecosystem that has given rise to a strong, innovative community, as well as an inventory of system and network management applications that really help leverage the benefits offered by running Linux on Power.Get the Guide