Linux Means Business: Linux for Internet Business Applications
When you call an 800 number to complain about a dead bug in your cereal or to ask why your new modem doesn't work on your old 486, chances are you're not talking directly to the manufacturer of the product. As companies throughout the '80s and '90s have continued to shed those business functions not considered core strengths, the vertical market of call-center outsourcing has grown rapidly. Ruppman Marketing Technologies in Peoria, Illinois is one of the pioneers in this industry, having answered telephone calls for client firms for 26 years. As the Internet's expansion into mainstream usage has become impossible to ignore, Ruppman has looked to expand its market to customer service over the Internet.
I was hired in April 1997 to oversee this new territory. Ruppman, like many of its competitors, had made vague and hesitant steps toward answering e-mail inquiries from web sites and sending brochures requested through web forms, but our CEO entrusted me with leapfrogging such timid steps and positioning Ruppman for the inevitable time when people would more likely check a web site for answers than call an 800 number.
I came in with a hobbyist Linux background, and it was immediately clear to me that the development budget was limited and that a broad range of technologies had to be deployed in a finite amount of time. Linux was the only solution with the flexibility and price to achieve our goals.
Ruppman had advanced infrastructure in its traditional areas of phone switches and routing, but had evolved its data infrastructure rather haphazardly. The company had a hodge-podge of Internet access methods, including:
Dial-up Compuserve and AOL
A UUCP e-mail exchange with a local ISP
A leased 256KB line to a small Microsoft Exchange server from another ISP
Indeed, there was a growing number of Microsoft Exchange users in the absence of an official e-mail client standard.
First, we ordered and set up a Dell PowerEdge 2200 (Pentium II, 200MHz, 64MB RAM) with Caldera OpenLinux to be our e-mail post office (using Sendmail) and primary domain name server (using BIND). We finalized a deal to install a firewall from AT&T. Now we had a unified Internet gateway and could shut off all the other expensive or insecure conduits, thus removing the need for modems in the offices (see Figure 1). This also allowed us to take full advantage of our registered ruppman.com domain name, standardizing our e-mail addresses to the format firstname.lastname@ruppman.com.
Using Sendmail also allowed us to implement a common client requirement. When Ruppman handles customer e-mail, clients want it to appear as if they are handling the e-mail themselves. For this reason, it is unsuitable for Widget Inc. to point its customers to an address at ruppman.com; they would rather use an address in their own domain. This is easy enough for incoming mail, but we have to mask outgoing mail from Ruppman representatives to Widget's customers. This was done by, among other things, adding the rule in Listing 1 to an mc configuration file for Sendmail, then compiling the mc file with m4 and installing it as /etc/sendmail.cf. (See the Mail HOWTO for information on customizing Sendmail rules.)
Clients also wanted such features as auto-replies to e-mail queries and selected audit copies of outgoing mail. Also, many clients' volume required more than one representative, using either subject-based routing or a shared mailbox. Microsoft Exchange could handle some of these functions, but we needed more flexibility and were concerned about standards compliance. The client mailboxes were all set up as IMAP mailboxes on a Linux server, which gives us the following advantages:
All incoming mail is delivered through a procmail recipe which allows us to send courtesy responses, keep detailed records and set up mail routing of any complexity.
A blind copy (bcc) of all outgoing mail is sent to a special account, audit, which runs a Python script to select a random subset of outgoing messages to forward to client contacts.
Since IMAP allows all messages to be stored on the server, it makes shared mailboxes easy to access and manage.
The e-mail representatives use Netscape Communicator as an IMAP client, but because of bugs in its IMAP client interface, we are evaluating alternatives.
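The random audit selection described above can be sketched in a few lines of Python. This is only an illustration, not the script we run: the function name select_for_audit, the rate parameter and the injectable random generator are all assumptions made for the example, and forwarding the selected messages to a client contact is left out.

```python
import random
from email import message_from_string


def select_for_audit(raw_messages, rate, rng=None):
    """Return a random subset of messages, parsed, for audit forwarding.

    raw_messages: iterable of RFC 822 message texts (hypothetical input;
                  the real audit account would receive them one at a time).
    rate:         fraction of messages to keep, between 0.0 and 1.0.
    rng:          optional random.Random instance, useful for repeatable runs.
    """
    rng = rng or random.Random()
    selected = []
    for raw in raw_messages:
        # rng.random() is in [0.0, 1.0), so rate=1.0 keeps everything
        # and rate=0.0 keeps nothing.
        if rng.random() < rate:
            selected.append(message_from_string(raw))
    return selected
```

Calling select_for_audit(messages, 0.05) would keep roughly one message in twenty; the right rate is something each client would have to specify.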
Our new Internet architecture has the additional advantage that we have a way to allocate Internet access costs to departments according to usage. We implemented a Python script on the main Linux server to parse the firewall logs collected by syslogd and produce a report of bytes used per department.
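A usage report of this kind might look roughly like the following sketch. The log line layout (src= and bytes= fields), the subnet-to-department table and the department names are all invented for illustration; a real firewall's syslog format would need its own pattern.

```python
import ipaddress
import re
from collections import defaultdict

# Hypothetical mapping of internal subnets to departments.
DEPARTMENTS = {
    "10.1.0.0/16": "Internet Services",
    "10.2.0.0/16": "Call Center",
}

# Assumed log layout: "... src=<ip> ... bytes=<count>".
LINE_RE = re.compile(r"src=(?P<src>\S+)\s.*?\bbytes=(?P<bytes>\d+)")


def bytes_by_department(lines, departments=DEPARTMENTS):
    """Sum the bytes field of each log line under the source's department."""
    networks = [(ipaddress.ip_network(cidr), name)
                for cidr, name in departments.items()]
    totals = defaultdict(int)
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not a traffic record
        try:
            addr = ipaddress.ip_address(match.group("src"))
        except ValueError:
            continue  # malformed address, skip the line
        dept = next((name for net, name in networks if addr in net),
                    "unallocated")
        totals[dept] += int(match.group("bytes"))
    return dict(totals)
```

Passing an open log file to bytes_by_department yields a dictionary of totals that can be printed as the monthly report.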
Building an Intranet soon became our next initiative. We installed the Apache WWW server and the Samba NetBIOS server on the same Dell Linux server. Samba was used to export Linux directories as public shares to the largely Windows 95 user base, or as password-protected private shares for Internet Services, our department. Other departments started attaching data to our Intranet at an amazing rate. Clearly, this simple but powerful technology had filled a big need for information-sharing tools. Both Apache and Samba functions are heavily used throughout the company and have held up quite well. In fact, although we have since off-loaded some functions to other servers, for several months one Pentium Pro-based server running Linux ran mail, DNS, central logging, IMAP, SMB and WWW for over 1000 users with little or no downtime.
We used native Linux tools such as the DBM database and Python utilities such as the calendar suite to add useful content to the Intranet as well. We publish a phone list, which is frequently updated, and a list of Ruppman clients. We keep a calendar of Internet Services activities and schedules on the Intranet, and we provide access to a database of people with proxy access to the Internet.
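As an illustration of how little code a DBM-backed phone list needs, here is a minimal sketch using Python's standard dbm module. The function names, the file path argument and the idea of keying the list by employee name are assumptions for the example, not a description of our actual Intranet code.

```python
import dbm


def save_phone_list(path, entries):
    """Write a {name: extension} dictionary into a DBM file at path."""
    with dbm.open(path, "c") as db:  # "c" creates the file if needed
        for name, extension in entries.items():
            db[name.encode("utf-8")] = extension.encode("utf-8")


def lookup_extension(path, name):
    """Return the stored extension for name, or None if it is not listed."""
    key = name.encode("utf-8")
    with dbm.open(path, "r") as db:
        return db[key].decode("utf-8") if key in db else None
```

A CGI script behind Apache could call lookup_extension for a single query, or iterate over the keys to render the whole list.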
These systems quickly brought Ruppman to a point of basic Internet competence, but far more was required. Preparing for the future of customer service on the Internet involved quite a bit of application development, so a team was assembled in my group for this purpose.
The development team began using a combination of C, C++ and shell scripts, but we quickly settled on Python as our overall development language. Our lead software engineer and I had used C++ as the cornerstone of our previous careers, but we soon came to admire Python's expressive power, comprehensive library and clean syntax. We purchased a Compaq ProLiant 2500 (Pentium II, 300MHz, 64MB RAM) as a development server and failover backup. We anticipated running SCO UNIX on it, but being used to the broad toolset that comes with Linux distributions, we found SCO UNIX to be woefully inadequate in comparison. Efforts to compile or install our favorite tools proved so cumbersome that we quickly abandoned SCO for Caldera OpenLinux. Unfortunately, we then found that Compaq servers are not well-suited for Linux. Compaq adds many proprietary features for its ManageWise server management suite and has not ported the “agents” for these features to Linux, so much of the machine's design has to be bypassed in order to run Linux. Perhaps for this reason, this machine has proved rather slow running Linux, and we are in the process of replacing it with a Dell PowerEdge 4200 (Dual Pentium II, 300MHz, 64MB RAM).
The first major development task was to create an Internet dealer locator. This popular web site feature is an application that allows the customer to enter his or her address or zip code and receive a list of nearby dealers or service centers. Ruppman already had such an application running on a mainframe for telephone representatives, but Internet Services decided to build a locator from scratch using an object-relational database and a geographic-matching (geo-matching) module. We chose PostgreSQL as the database, because it is object-relational and supports spatial relationships (R-trees). It also has a native Python interface, PyGreSQL. The resultant application is heavily disk-I/O bound, and we ended up buying a Sun Ultra Enterprise (Dual UltraSPARC2, 250MHz, 128MB RAM) for its high-bandwidth backplane and its hardware scalability. I have since come to learn more about comparable Linux-based setups on Alpha and even Sun boxes.
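The core idea of geo-matching can be shown without a database. The sketch below is not our locator: it swaps the spatial index for a plain linear scan ranked by great-circle (haversine) distance, and the dealer records, field names and coordinates are hypothetical. It assumes the customer's address or zip code has already been converted to a latitude and longitude.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers between two points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(a)))


def nearest_dealers(lat, lon, dealers, limit=3):
    """Return up to limit dealers, closest first.

    dealers is a list of dicts with "name", "lat" and "lon" keys
    (a made-up record layout for this example).
    """
    ranked = sorted(
        dealers,
        key=lambda d: great_circle_km(lat, lon, d["lat"], d["lon"]),
    )
    return ranked[:limit]
```

A linear scan is fine for a few hundred dealers; a spatial index such as the R-trees mentioned above earns its keep when the table is too large to rank in full on every request.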
Another product developed in my group is a Usenet and web monitoring service, where we search Usenet and the WWW on behalf of clients for consumer discussion of their company or product. First, we clip articles using a search engine, then our representatives check the clips for relevance. We set up a Linux server and installed an NNTP server on it, so that /var/spool/news can be searched with a Python script that invokes a recursive grep. Hits are then accumulated in a file which is combed by a representative using a custom web interface.
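The spool search can be approximated in pure Python. The sketch below is an assumption-laden stand-in, not our script: it walks the spool directory itself rather than invoking grep, matches keywords case-insensitively, and returns the hits as a list instead of writing the file the representatives review. The function name find_hits and the tuple layout are invented for the example.

```python
import os


def find_hits(spool_dir, keywords):
    """Walk spool_dir and return (path, line number, line) for each match.

    keywords are matched case-insensitively as plain substrings.
    """
    needles = [k.lower() for k in keywords]
    hits = []
    for dirpath, _dirnames, filenames in os.walk(spool_dir):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # latin-1 decodes any byte sequence, so odd articles
                # cannot raise a decoding error.
                with open(path, encoding="latin-1") as article:
                    for lineno, line in enumerate(article, 1):
                        lowered = line.lower()
                        if any(n in lowered for n in needles):
                            hits.append((path, lineno, line.rstrip("\n")))
            except OSError:
                continue  # unreadable article, skip it
    return hits
```

Running find_hits("/var/spool/news", ["widget"]) over a full news spool would be slow compared with an indexed search, which is one reason a human relevance check follows the clipping step.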
Figure 2. Employee Time Log
Figure 3. Project Development Manager
Keeping daemons and applications up to date on a production server is an important part of security and standards adherence. The widespread availability of Linux news and resources has helped us greatly in this regard. We often found when working with other departments that servers based on other operating systems tended to suffer from version lag. Some NT servers were not patched to protect against the rampant teardrop denial-of-service attack, and we found that a mission-critical HP 9000 box was running daemons from 1994, including Sendmail, which is often a hacker target. Most of the time, the reason for the lag was that updates are not easy to keep track of or even apply for such environments as NT and HP-UX. To some extent it is a matter of system administrator vigilance, but the Linux community makes it exceptionally easy to stay responsible.
However, we recently decided that keeping the aging set of RPM files from Caldera OpenLinux 1.1 up to date was becoming an excessive chore. Our tests had shown some advantages to the features of the GNU glibc library, so we upgraded all of our Linux machines to Red Hat 5.0. Besides problems with Disk Druid and the strange fact that the install doesn't set up the /etc/hosts and in.ftpd files properly, we've been very satisfied with the new distribution. The disadvantage is that we lose the benefit of Caldera's Novell Directory Services client, just as the rest of our organization is migrating to Novell Intranetware.
In all, Ruppman has proven a remarkable test case for the suitability of Linux in real business applications. The exceptional robustness of Linux has enabled us to maintain a high service level within our group, and its flexibility and broad toolset have enabled us to quickly solve a wide variety of problems that would require lengthy research and a significant investment under other platforms. The most common reservation about Linux from IT types involves technical support, but in almost a year, we have never had to call Caldera or Red Hat. We solved almost every one of our problems with a query on http://www.dejanews.com/, an excellent Usenet archive and search engine. While I have been very lucky to receive little management interference with my technology choices, I am convinced that if Linux advocates can sneak our favorite OS into a moderately visible application, its low cost and high performance will begin breaking down barriers to its acceptance. I hope my experiences at Ruppman provide some inspiration in that direction.