Paranoid Penguin - Building a Secure Squid Web Proxy, Part I
Consider the venerable Web proxy—back when the Internet was new to most of us, setting up a Web proxy was a convenient way to grant users of an otherwise non-Internet-connected network access to the World Wide Web. The proxy also provided a convenient point to log outbound Web requests, to maintain whitelists of allowed sites or blacklists of forbidden sites and to enforce an extra layer of authentication in cases where some, but not all, of your users had Internet privileges.
Nowadays, of course, Internet access is ubiquitous. The eclipsing of proprietary LAN protocols by TCP/IP, combined with the technique of Network Address Translation (NAT), has made it easy to grant direct access from “internal” corporate and organizational networks to Internet sites. So the whole idea of a Web proxy is sort of obsolete, right?
After last month's editorial, we return to technical matters—specifically, to the venerable but assuredly not obsolete Web proxy. This month, I describe, in depth, the security benefits of proxying your outbound Web traffic, and some architectural and design considerations involved with doing so. In subsequent columns, I'll show you how to build a secure Web proxy using Squid, the most popular open-source Web proxy package, plus a couple of adjunct programs that add key security functionality to Squid.
The last time I discussed proxies in this space was in my December 2002 article “Configuring and Using an FTP Proxy”. (Where does the time go?) A quick definition, therefore, is in order.
The concept of a Web proxy is simple. Rather than allowing client systems to interact directly with Web servers, a Web proxy impersonates the server to the client, while simultaneously opening a second connection to the Web server on the client's behalf and impersonating the client to that server. This is illustrated in Figure 1.
Because Web proxies have been so common for so long, all major Web browsers can be configured to communicate directly through Web proxies in a “proxy-aware” fashion. Alternatively, many Web proxies support “transparent” operation, in which Web clients are unaware of the proxy's presence, but their traffic is diverted to the proxy via firewall rules or router policies.
Just because nowadays it's easy to interconnect TCP/IP networks directly doesn't mean you always should. If a nasty worm infects systems on your internal network, do you want to deal with the ramifications of the infection spreading outward, for example, to some critical business partner with whom your users communicate over the Internet?
In many organizations, network engineers take it for granted that all connected systems will use a “default route” that provides a path out to the Internet. In other organizations, however, it's considered much less risky to direct all Web traffic out through a controlled Web proxy to which routes are internally published and to use no default route whatsoever at the LAN level.
This has the effect of allowing users to reach the Internet via the Web proxy—that is, to surf the Web—but not to use the Internet for non-Web applications, such as IRC, on-line gaming and so forth. It follows that what end users can't do, neither can whatever malware that manages to infect their systems.
Obviously, this technique works only if you've got other types of gateways for the non-Web traffic you need to route outward, or if the only outbound Internet traffic you need to deal with is Web traffic. My point is, a Web proxy can be a very useful tool in controlling outbound Internet traffic.
What if your organization is in a regulated industry, in which it's sometimes necessary to track some users' Web access? You can do that on your firewall, of course, but generally speaking, it's a bad idea to make a firewall log more than you have to for forensics purposes. This is because logging is I/O-intensive, and too much of it can impact negatively the firewall's ability to fulfill its primary function, assessing and dealing with network transactions. (Accordingly, it's common practice mainly to log “deny/reject” actions on firewalls and not to log “allowed” traffic except when troubleshooting.)
A Web proxy, therefore, provides a better place to capture and record logs of Web activity than on firewalls or network devices.
Another important security function of Web proxies is blacklisting. This is an unpleasant topic—if I didn't believe in personal choice and freedom, I wouldn't have been writing about open-source software since 2000—but the fact is that many organizations have legitimate, often critical, reasons for restricting their users' Web access.
A blacklist is a list of forbidden URLs and name domains. A good blacklist allows you to choose from different categories of URLs to block, such as social networking, sports, pornography, known spyware-propagators and so on. Note that not all blacklist categories necessarily involve restricting personal freedom per se; some blacklists provide categories of “known evil” sites that, regardless of whatever content they're actually advertising, are known to try to infect users with spyware or adware, or otherwise attack unsuspecting visitors.
And, I think a lot of Web site visitors do tend to be unsuspecting. The classic malware vector is the e-mail attachment—an image or executable binary that you trick the recipient into double-clicking on. But, what if you could execute code on users' systems without having to trick them into doing anything but visit a Web page?
Phishing Web sites typically depend on DNS redirection (usually through cache poisoning), which involves redirecting a legitimate URL to an attacker's IP address rather than that site's real IP, so they're difficult to protect against in URL or domain blacklists. (At any rate, none of the free blacklists I've looked at include a phishing category.) Spyware, however, is a common blacklist category, and a good blacklist contains thousands of sites known to propagate client-side code you almost certainly don't want executed on your users' systems.
Obviously, no URL blacklist ever can cover more than a tiny fraction of the actual number of hostile Web sites active at any given moment. The real solution to the problem of hostile Web sites is some combination of client/endpoint security controls, better Web browser and operating system design, and in advancing the antivirus software industry beyond its reliance on virus signatures (hashes of known evil files), which it's been stuck on for decades.
Nevertheless, at this very early stage in our awareness of and ability to mitigate this type of risk, blacklists add some measure of protection where presently there's very little else. So, regardless of whether you need to restrict user activity per se (blocking access to porn and so forth), a blacklist with a well-maintained spyware category may be all the justification you need to add blacklisting capabilities to your Web proxy. SquidGuard can be used to add blacklists to the Squid Web proxy.
|Dynamic DNS—an Object Lesson in Problem Solving||May 21, 2013|
|Using Salt Stack and Vagrant for Drupal Development||May 20, 2013|
|Making Linux and Android Get Along (It's Not as Hard as It Sounds)||May 16, 2013|
|Drupal Is a Framework: Why Everyone Needs to Understand This||May 15, 2013|
|Home, My Backup Data Center||May 13, 2013|
|Non-Linux FOSS: Seashore||May 10, 2013|
- Dynamic DNS—an Object Lesson in Problem Solving
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
- New Products
- Drupal Is a Framework: Why Everyone Needs to Understand This
- Download the Free Red Hat White Paper "Using an Open Source Framework to Catch the Bad Guy"
- Validate an E-Mail Address with PHP, the Right Way
- A Topic for Discussion - Open Source Feature-Richness?
- New Products
- Dart: a New Web Programming Experience
2 hours 8 min ago
- Keeping track of IP address
3 hours 59 min ago
- Roll your own dynamic dns
9 hours 12 min ago
- Please correct the URL for Salt Stack's web site
12 hours 24 min ago
- Android is Linux -- why no better inter-operation
14 hours 39 min ago
- Connecting Android device to desktop Linux via USB
15 hours 8 min ago
- Find new cell phone and tablet pc
16 hours 6 min ago
17 hours 34 min ago
- Automatically updating Guest Additions
18 hours 43 min ago
- I like your topic on android
19 hours 29 min ago
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?