Deploying the Squid proxy server on Linux
To provide Internet access for users in the SAS Institute Europe, Middle East and Africa (EMEA), a number of proxy servers have been installed both at the country office level and centrally at SAS European Headquarters in Heidelberg, Germany.
These servers run the Squid proxy server software; this software is available under the GNU general public license. In brief, Squid provides for caching and/or forwarding requests for internet objects such as the data available via HTTP, FTP and gopher protocols. Web browsers can then use the local Squid cache server as a proxy HTTP server, reducing access time as well as bandwidth consumption. Squid keeps these objects in RAM or on local disk. Squid servers can be installed in hierarchies to allow central servers to build large caches of data available for servers lower in the hierarchy.
Squid has been in use for some time around SAS EMEA and is performing very well; the software is extremely stable and is delivering seamless access to the Internet for connected clients.
The original proxy servers were installed on HP workstations running release 10.20 of HPUX and Squid version 2.1. This was run on a mix of hardware but typically HP9000/720 workstations with 64MB of memory and about 4GB of disk. This configuration is difficult to support; the hardware has reached an age where failures are becoming common and the increased use of the Internet coupled with growth in the offices has left the configuration under-powered. Our main problem of late has been disk space management; the increased access patterns have left our existing log areas looking undersized at 100MB and our actual cache directories are looking rather small at 2GB.
As a result, we began researching some alternatives in order to maintain the service. Since we were happy with the Squid software itself, and we already had a good understanding of the configurations, we decided to continue using Squid but to review the hardware base.
Since Squid is an open-source project and well supported under Linux, it seemed a good idea to explore the possibility of using a Linux-based solution using a standard SAS EMEA Intel PC. This configuration is a Dell desktop PC with 256MB of RAM, 500MHz Intel Pentium and internal 20GB IDE disk. As Dell has a relationship with Red Hat, it made sense to their distribution. Also, SAS has recently released versions of the SAS product in partnership with Red Hat.
The original architecture in SAS EMEA used three central parent Squid caches with direct access to the Internet and child Squid servers in many of the country offices. Some of the smaller countries' operations still connect to the central headquarter caches, and we felt that using less expensive hardware would give us the opportunity to install proxies in these offices. Further, in many of the country operations the SAS presence is split among several offices connected via WAN links; again the less expensive hardware gives us the opportunity to install proxies in these offices. These deployments should improve the response times for web traffic and hopefully reduce the overhead on our WAN links.
Finally, we had some reservations about the resilience and availability of the original infrastructure, and we felt that with revised client and server configurations we could improve the service level of our internal customers.
Our new architecture is not much altered in principle; we still have three central servers, but they now run Linux. We are deploying more child proxies, and we require a three-level hierarchy in some offices. For example, some countries have satellite offices that only connect to the SAS Intranet via a single WAN link to the country headquarters; in these cases we will install proxies at the satellite office with a preferred parent cache in the country headquarters rather than European headquarters.
A new addition to our architecture has been the Trend Interscan Virus Wall product for HTTP virus-scanning. We have installed three virus scanning systems also running Red Hat Linux; these systems are positioned behind the current Squid parent caches, providing a virus-scanning layer between the Squid cache hierarchy and the external Internet. Since the virus scanners are simply pass-through in nature, we simply configure our top-level Squid servers to “round-robin” between them.
The original HP-UX servers were installed by duplicating a disk image from a known configuration. This was a totally unsatisfactory method for several reasons, not the least of which was that it was difficult to make provisions for maintenance of this image for patches or version updates for Squid, etc.
Our goal was a scripted and automated installation that could be performed quickly by local office staff. We have been pleased with the implementation of this concept, and it has some useful benefits with regard to recovery and configuration management (see below).
We produced a KickStart configuration to build machines for us. KickStart is a tool from Red Hat to automate system installations. Basically we can tell the install how to partition a disk, which packages (RPMs) to install and include some local configuration steps via shell commands. Our KickStart configuration is placed on a floppy disk along with normal Linux boot utilities, and we instruct the KickStart to perform installation from a CD.
This means that for a new proxy server we can arrange shipment of a PC that looks similar to our expected hardware configuration and ship a CD and floppy disk for the remote office to complete the configuration.
The installation process has been automated with three exceptions: users will be prompted for the hostname, IP information and the keyboard type (some of our offices use different keyboards for the local language). The KickStart hard-codes all other choices; for example the installation language is always English, the choice of packages are always the same and the disk always partitions in the same way.
The basic installation from placing the disk in the drive until the reboot with a freshly installed OS takes under ten minutes. This is much quicker than we could do and a huge decrease in the time it would take to perform a HP-UX installation. This obviously has some implications for our backup and recovery procedures (see below).
We have the usual problems with running our systems: configuration files need updating, software needs upgrading, log files need rotating and processes need monitoring. In the past a seriously inconsistent set of shell scripts and cron jobs had been used, or more normally, configurations had been allowed to diverge. We had replaced some of these with rdist jobs but this was not wholly adequate, so we looked around for another tool.
The best tool seemed to be cfengine. This allows us to build a common definition of tools and configuration at a central location and then distribute it to all our servers. Implementing cfengine has been highly successful, although it does require some careful planning of the configuration structure and a fairly careful reading of the documentation.
Some of the files we distribute are completely standard, and we can simply have cfengine send them “as is” to the target system. However, some are based on a common template and need alterations for each system. A good example of this would be the main Squid configuration file. In this case we ship a template via cfengine and a small script that knows how to make the transformation. We use the feature in cfengine that allows commands to be run after the receipt of a file so this shell script is run and then Squid is signalled to re-read the configuration file. This way we can keep a centrally controlled and coordinated configuration and know with certainty the status of a remote system.
In turn, our cfengine configuration is built on a local NIS map we maintain; this NIS map simply indexes host names against capabilities. For example, the keyword SQUID-CHILD is used to flag that a machine is a second-level proxy as opposed to SQUID-MASTER. This NIS map is processed to produce classes for use in cfengine; the end result of this is that configuration information is stored centrally, not on each server.
More problematic for us has been maintenance of the installed software. We are running a system largely built around Red Hat 6.2 but since installing some of the proxies, Red Hat released updates that we required. Typically security issues are a priority. We also have some locally derived RPMs, and use a later version of Squid with some options Red Hat does not include, and we use a locally built version of the gated routing dæmon that includes support for some additional routing protocols. Finally, we have some RPMs that were never in the Red Hat distribution.
The obvious step to take was to build an FTP server for updates. We have used the built-in FTP server on a Network Appliance Filer that also contains our distributions from Red Hat. We have an FTP mirror job that pulls the latest updates from a Red Hat mirror site. Our FTP server also has a tree for our local RRMs.
It's taken some time to get the process of updating RPMs correct and we're still not totally happy. The best tool we have currently is autorpm. This is configured to look at our FTP server and automatically install Red Hat updates, ignoring those RPMs we didn't install to start with, and to install or update all RPMs inside our local tree.
Our problem here is that autorpm cannot deal with some circular dependencies contained in some RPMs. It's easy to manually resolve this, but we would prefer to automate this process. This seems less the fault of autorpm and more a problem with the actual RPMs.
This problem with the RPMs has also had an impact on our installation procedures. We were very happy with the installation time being under ten minutes, and it only took a few minutes to apply our old rdist updates for files. Now though, it was taking over ten minutes to apply the RPM updates on some sites and it was a manual process requiring logging in to the system to complete. This was pretty critical for the Squid software where our Squid configuration file requires the newer version of Squid.
We also had an issue with the install itself now; there was no way that an install could complete locally without this intervention from us if we had updated some of the software. This was a fairly important issue for us. There are occasions when some type of failure is experienced at a remote office and, we are not available. In these situations we would like to remotely access the office to simply reinstall their proxy server, possibly on new hardware. This only works if we do not need to intervene in the process.
As a result, we went back and reconsidered one of the initial assumptions—we'd said that we would ship only standard Red Hat CDs and use KickStart, cfengine and autorpm to customize them. We now decided to produce our own Linux distribution largely based on the Red Hat distribution, but including our new and updated RPMs and some configuration files. The idea now was that the initial install would produce a working proxy and then our scheduled, automated jobs would come along later and tidy up any small problems.
Producing our own distribution has been pretty successful; we can produce our distribution from a new Red Hat release in about one to two days. We take the basic Red Hat distribution and remove many of the RPMs from the tree. This is not a very scientific process: we do not remove every RPM, only the ones taking up the most space. We then add our own RPMs to the tree, modify the various control structures in the tree and cut a CD.
Since our new media contains cfengine and autorpm, we can configure the post-install steps of the KickStart process to run these processes on the first boot after install. This should bring our new machine quickly up to date with our current configuration.
However, when we moved to Red Hat 6.2 we hit an interesting issue with KickStart: the new version does not necessarily prompt the user for IP addresses and other network information when they are not present in the KickStart file during an install from CD-ROM. A careful reading of the slightly updated documentation suggests this was deliberate on the part of Red Hat, but was a major headache for us. Ultimately, we rewrote the section of code in the Anaconda installer to restore the original behavior.
We looked at the various backup and recovery options and decided to use the simplest backup procedure available; namely we do not back up the machines. When we looked at the proxies it was apparent that only the log data and the cache structures were unique to any machine. The log data is periodically copied to a central location for analysis and the cache, while valuable, can be cleared and rebuilt over time.
The Linux proxies we have deployed so far have had no major service interruptions or hardware problems, but if we do have problems, we propose to reinstall the remote system. In the case of hardware failure, we expect most remote offices will be able to find a spare PC and repeat the installation.
As a result, not only are we not taking backups, but we do not need to make any provision for resilient hardware or technologies for mirroring, etc. In fact, using specialized hardware would reduce our availability and resilience since we would not have spare hardware at the remote office to make a replacement, whereas we have many standard desktop PCs of a sufficiently similar configuration that our KickStart procedure will work on them.
We expect that in the event of a failure the remote office can recover the service in around 20 minutes, given that a PC is available. We have made some effort with the client browser configuration to make this transparent to the desktop user. This gives us a highly available solution.
The old proxy configuration relied on the clients being configured to use a named proxy server explicitly. For the sites where more than one proxy server was available, the hostname used for the proxy server was in a domain managed by lbnamed. The version in use had been modified slightly but was still unsuitable for our use. lbnamed used rpc.rstatd to get loading information from machines in a pool, and then depending on the weighting, lbnamed will return different hosts, thus creating some limited load balancing. In practice this does not distribute effectively although it has the useful feature that if a machine is unreachable it will be removed from the pool. Unfortunately, this useful feature is undermined by the fact that only load is used (in the basic Perl version) to weight hosts. If our Squid server dæmon dies the load on the machine tends to to be reduced, which can leave a machine where a failure has occurred at the top of the pool. There is an implementation in C that can look at other factors apart from load but some experimentation with this was not fruitful. Our overall impression was that, as previously installed, this was more successful than a standard DNS round-robin would have been.
Our testing was performed against our three new Linux proxy servers, and one factor we noted was that response time could be improved if we sent the HTTP query to the machine most likely to have the object in cache; of course this seems a fairly obvious but not immediately useful observation.
In fact, the idea of intelligently selecting the cache to query is a fairly simple thing to achieve using the Proxy Auto Configuration (PAC) file feature supported by most mainstream browsers. This basically entails having a web server somewhere that can return a proxy PAC script.
Our task was made doubly easy because someone had already produced some sample code for PAC files that balanced queries across several servers. We took as our base the work done at Sharp on a “Super Proxy Script” using URL hashing. This is a simple but ingenious idea that hashes the URL being requested and then returns a proxy to use. This is statistically random in terms of load distribution over large numbers of URLs, but repeated queries for the same URL will always return the same proxy. We also make use of the ability in the PAC script to return an array of proxy servers; the effect is that if the first named proxy fails then queries by the client are routed seamlessly to the next proxy in the list.
At the headquarter sites, we return arrays based on two or three proxy servers depending on the campus location. For sites where only a single proxy is available, we do not use URL hashing and only return a pair of proxies, namely the local proxy server and a central server for fallback.
The use of this central fallback for remote users is the feature that gives us the most resilience. Should a remote proxy fail the remote clients for that proxy will detect the failure and use the central host, the clients will check every 30 minutes (MSIE and Netscape) to determine if the original server is active and return to using it if it is.
In fact, since we have some clients using Microsoft Internet Explorer version 5.0, we name the proxy.pac script as wpad.dat. This allows “unconfigured” IE5 clients to locate the wpad.dat file automatically using the WPAD method of searching for a URL of the form http://wpad.local.domain/wpad.dat.The use of WPAD is not particularly critical to us but it is a useful feature. Reviewing the logs during implementation suggests that we may be saving our help desk some calls from mobile users who would otherwise have required help setting their proxy manually when visiting other offices.
Currently we use Apache web servers to return the PAC scripts, and the Apache runs either on the local proxy host or another locally available web server. It is possible that using one of the stripped-down web servers, for example, several are implemented in Perl, may be more secure and represent less overhead. We have not explored this approach yet.
Our WPAD servers currently have multiple address records in DNS so in the event of a single WPAD server failing, we have some resilience. For sites with only a single WPAD server, we rely on new client sessions using previously cached settings.
There is a small flaw in this approach where remote sites have multiple proxies: the URL hashing carefully selects which local proxy to use but then the proxy simply round-robins the query over the three central systems. We could use some of the facility of Squid to exchange cache digests so the local proxy would forward requests to the central server most likely to generate a hit, but in practice the cost in bandwidth on the WAN links makes this ineffective. Instead we let the round-robin query any central server and then have the central servers use cache digest exchange to generate a hit if possible.
We anticipate that we will want to keep our proxy servers up to date with a reasonably current Red Hat version. While we will rely on cfengine and autorpm to make the small alterations in configuration, we do not expect that upgrading the entire OS over the network is really feasible for us.
So, instead, we intend to ship our fresh installation to the remote office and have them carry out a new install if we want to upgrade. We expect this will occur about three or four times a year possibly. Because we have such a clean installation process and tight control over configurations, there's little penalty for making such frequent upgrades. Since the remote office would retain the previous media, regression to an earlier version should also be fairly straightforward.
We are very eager to take these frequent updates as we experience many problems on the HP-UX proxies directly related to lack of software and patch updates. For example, we see HP-UX problems occur for which patches are available, and have an old version of Squid and many tools we try to run since old versions of Perl are available.
An upgrade can be done during a normal business day for a remote office, and for the ten or 20 minutes downtime the clients will simply fall back to the central standby servers, or for sites with multiple servers, they use their own local fall back servers.
For planning purposes the most useful data is a historical view of the activity the cache is seeing. We gather this data using MRTG (Multi Router Traffic Grapher) and the built-in SNMP agent in Squid. This has been a little awkward to configure, but we are able to create some useful graphs of the performance of the proxies. We collect many metrics from the proxies that are available from an internal web page. We have some code that generates an index page of all servers by walking our NIS map of hosts (see above); this code also includes a thumbnail of some key metrics. We are particularly interested to see the trends for cache hit and cache miss times as well to track any overall growth in requests.
We also use the Calamaris tool (see Resources) to produce snapshots of the status of the proxies and some analysis of the logs.
We also have a copy of the HP OpenView network management products. Currently we are using this to monitor the status of the machines only, but we plan to customize it to monitor the health of the remote Squid software to simply alert us if the software fails.
We have also started to use cfengine running from cron every five minutes to check the health of various processes and to attempt an automatic restart.
Additionally, we deliver Internet usage reports to the local offices. Currently we collate the logs into an SAS dataset and use SAS reporting tools to produce reports, but we also have products like SAS IT Service Vision and SAS WebHound that are able to produce similar analysis of traffic. These are powerful tools, and we can use them to provide a much fuller analysis of the data.
We have been pleased with the stability and performance of our Linux solution. There's no doubt that the reduced hardware costs are allowing us to install proxies in locations where previously it was not cost effective to build more resilience for other locations. Since our new configuration is more resilient and, by and large, better configured than the previous one, we have less problems than with the old proxies.
The worst problem we have seen operationally so far has been file system corruption. We have had a remote proxy suffer a power failure and then fail to boot because of file system problems. As an interim measure we have amended the startup code to be more tolerant of these failures, but a more long-term solution may be to use one of the log-based file systems that are becoming available.
We are beginning to see that more local links to ISPs are being deployed in our offices. As a result we will need to fine-tune our configuration to support HTTP virus scanning at an office level and direct access from the proxy to the Internet. In practice this will simply mean adding some tasks to cfengine and autorpm to install and configure the new modules.
Since we now have a procedure to produce our own Linux distribution media, it is likely we will review a more generic Linux deployment internally. This will occur to some extent anyway now that SAS products are available on Linux; we hope producing standard distributions for internal use will give us some loose configuration control. We will likely consider what other functions might usefully run on Linux; for example, we may move some DNS/BIND functions to Linux.