Paranoid Penguin - Building a Secure Squid Web Proxy, Part I
Just How Intelligent Is a Web Proxy?
You should be aware of two important limitations in Web proxies. First, Web proxies generally aren't very smart about detecting evil Web content. Pretty much anything in the payloads of RFC-compliant HTTP and HTTPS packets will be copied verbatim from client-proxy transactions to proxy-server transactions, and vice versa.
Blacklists can somewhat reduce the chance of your users visiting evil sites in the first place, and content filters can check for inappropriate content and perhaps for viruses. But, hostile-Web-content attacks, such as invisible iframes that tell an attacker's evil Web application which sites you've visited, typically will not be detected or blocked by Squid or other mainstream Web proxies.
Note that enforcing RFC compliance is nothing to sneeze at. It constitutes a type of input validation that could mitigate the risk of certain types of buffer-overflow (and other unexpected server response) attacks. But nonetheless, it's true that many, many types of server-side evil can be perpetrated well within the bounds of RFC-compliant HTTP messages.
Second, encrypted HTTPS (SSL or TLS) sessions aren't truly proxied. They're tunneled through the Web proxy. The contents of HTTPS sessions are, in practical terms, completely opaque to the Web proxy.
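To make the tunneling point concrete, here is a minimal Python sketch of the HTTP CONNECT request a browser sends to a proxy to open an HTTPS tunnel, and of the few fields the proxy can actually read from it. The function names and the example.com host are illustrative assumptions, not part of Squid. Everything sent after the tunnel is established is encrypted end to end.

```python
def build_connect_request(host: str, port: int = 443) -> bytes:
    """Build the plain-text CONNECT request a client sends to the proxy."""
    target = f"{host}:{port}"
    return f"CONNECT {target} HTTP/1.1\r\nHost: {target}\r\n\r\n".encode("ascii")


def proxy_visible_fields(request: bytes) -> dict:
    """Return what a proxy can learn from the CONNECT request line.

    Only the method, destination host and destination port are visible.
    URLs, headers and bodies inside the tunnel are never seen in the clear.
    """
    request_line = request.split(b"\r\n", 1)[0].decode("ascii")
    method, target, _version = request_line.split(" ")
    host, _, port = target.rpartition(":")
    return {"method": method, "host": host, "port": int(port)}


if __name__ == "__main__":
    req = build_connect_request("example.com")  # hypothetical destination
    print(proxy_visible_fields(req))
```

In other words, a proxy can still allow or deny an HTTPS destination by host name and port, but it cannot inspect what is exchanged with that destination.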
If you're serious about blocking access to sites that are inappropriate for your users, blacklisting is an admittedly primitive approach. Therefore, in addition to blacklists, it makes sense to do some sort of content filtering as well—that is, automated inspection of actual Web content (in practice, mainly text) to determine its nature and manage it accordingly. DansGuardian is an open-source Web content filter that even has antivirus capabilities.
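The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the domain list, the flagged terms, their weights and the score limit are all hypothetical, and the sketch is not how SquidGuard or DansGuardian are implemented. A blacklist looks only at where a request is going, while a content filter looks at what came back.

```python
from urllib.parse import urlsplit

# Hypothetical data for illustration only.
BLACKLISTED_DOMAINS = {"badsite.example"}
FLAGGED_TERMS = {"casino": 2, "jackpot": 1}
SCORE_LIMIT = 3


def domain_blacklisted(url: str) -> bool:
    """Blacklisting: match the URL's host against a list of known-bad domains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BLACKLISTED_DOMAINS)


def content_score(page_text: str) -> int:
    """Content filtering: add up the weights of flagged words in the page text."""
    return sum(FLAGGED_TERMS.get(word, 0) for word in page_text.lower().split())


def should_block(url: str, page_text: str) -> bool:
    """Block if either the destination or the content looks inappropriate."""
    return domain_blacklisted(url) or content_score(page_text) >= SCORE_LIMIT
```

A site that is not on any blacklist can still be caught by the content check, which is the reason to use both.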
What if you need to limit use of your Web proxy, but for some reason, can't use a simple source-IP-address-based Access Control List (ACL)? One way to do this is by having your Web proxy authenticate users. Squid supports authentication via a number of methods, including LDAP, SMB and PAM. However, I'm probably not going to cover Web proxy authentication here any time soon, because 802.1X is a better way to authenticate users and devices at the network level.
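For reference, the logic of a simple source-IP-based ACL amounts to a network membership test. The Python sketch below uses the standard ipaddress module. The 192.168.1.0/24 network is an assumed example LAN range, and the function name is hypothetical. It shows the idea only, not Squid's ACL syntax.

```python
import ipaddress

# Assumed example: the LAN range allowed to use the proxy.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.168.1.0/24")]


def client_allowed(client_ip: str) -> bool:
    """Return True if the client's source address falls in an allowed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)
```

This kind of check identifies machines, not people, which is why authentication comes up when per-user control is needed.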
Route-limiting, logging, blacklisting and authenticating are all security functions of Web proxies. I'd be remiss, however, not to mention the main reason many organizations deploy Web proxies, even though it isn't directly security-related—performance. By caching commonly accessed files and Web sites, a Web proxy can reduce an organization's Internet bandwidth usage significantly, while simultaneously speeding up end-users' sessions.
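The bandwidth saving is easy to see in a toy model. The Python sketch below is a deliberately simplified least-recently-used cache that counts how many bytes came from the origin server versus from the cache. The class name, the entry limit and the fetch callback are assumptions for illustration. A real proxy such as Squid also honors HTTP expiry and cache-control rules, which this sketch ignores.

```python
from collections import OrderedDict
from typing import Callable


class TinyCache:
    """A toy LRU cache keyed by URL, with byte counters for comparison."""

    def __init__(self, max_entries: int = 2) -> None:
        self.max_entries = max_entries
        self.store: "OrderedDict[str, bytes]" = OrderedDict()
        self.bytes_from_origin = 0  # traffic that used the Internet uplink
        self.bytes_from_cache = 0   # traffic served locally

    def fetch(self, url: str, origin_fetch: Callable[[str], bytes]) -> bytes:
        if url in self.store:
            # Cache hit: no uplink traffic, and mark the entry as recently used.
            self.store.move_to_end(url)
            body = self.store[url]
            self.bytes_from_cache += len(body)
            return body
        # Cache miss: fetch from the origin and remember the response.
        body = origin_fetch(url)
        self.bytes_from_origin += len(body)
        self.store[url] = body
        if len(self.store) > self.max_entries:
            self.store.popitem(last=False)  # evict the least recently used entry
        return body
```

If three users request the same 1,000-byte object, only the first request crosses the uplink. The other two are answered from the cache.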
Fast and effective caching is, in fact, the primary design goal for Squid, which is why some of the features I've discussed here require add-on utilities for Squid (for example, blacklisting requires SquidGuard).
Suppose you find all of this very convincing and want to use a Web proxy to enforce blacklists and conserve Internet bandwidth. Where in your network topology should the proxy go?
Unlike a firewall, a Web proxy doesn't need to be, nor should it be, placed “in-line” as a choke point between your LAN and your Internet uplink, although it is a good idea to place it in a DMZ network. If you have no default route, you can force all Web traffic to exit via the proxy by a combination of firewall rules, router ACLs and end-user Web browser configuration settings. Consider the network shown in Figure 2.
In Figure 2, Firewall 1 allows all outbound traffic to reach TCP port 3128 on the proxy in the DMZ. It does not allow any outbound traffic directly from the LAN to the Internet. It passes only packets explicitly addressed to the proxy. Firewall 2 allows all outbound traffic on TCP 80 and 443 from the proxy (and only from the proxy) to the entire Internet.
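The two rule sets just described can be expressed as a pair of small predicates. The Python sketch below is a model of the policy only, not firewall configuration. The zone labels ("lan", "proxy", "internet") and the Packet type are assumptions made for illustration, while the port numbers come from the description above.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    src: str       # where the connection originates: "lan", "proxy" or "internet"
    dst: str       # where it is headed
    dst_port: int  # destination TCP port


def firewall1_allows(pkt: Packet) -> bool:
    """Firewall 1: LAN hosts may reach only TCP 3128 on the proxy."""
    return pkt.src == "lan" and pkt.dst == "proxy" and pkt.dst_port == 3128


def firewall2_allows(pkt: Packet) -> bool:
    """Firewall 2: only the proxy may reach the Internet, on TCP 80 and 443."""
    return pkt.src == "proxy" and pkt.dst == "internet" and pkt.dst_port in (80, 443)
```

A browser on the LAN that tries to connect straight to a Web server on port 80 fails the first check, so its only route to the Web is through the proxy.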
Because the proxy is connected to a switch or router in the DMZ, if some emergency occurs in which the proxy malfunctions but outbound Web traffic must still be passed, a simple firewall rule change can accommodate this. The proxy is only a logical control point, not a physical one.
Note also that this architecture could work with transparent proxying as well, if Firewall 1 is configured to redirect all outbound Web transactions to the Web proxy, and Firewall 2 is configured to redirect all inbound replies to Web transactions to the proxy.
You may be wondering, why does the Web proxy need to reside in a DMZ? Technically, it doesn't. You could put it on your LAN and have essentially identical rules on Firewalls 1 and 2 that allow outbound Web transactions only if they originate from the proxy.
But, what if some server-side attacker somehow manages to get at your Web proxy via some sort of “reverse-channel” attack that, for example, uses an unusually large HTTP response to execute a buffer-overflow attack against Squid? If the Web proxy is in a DMZ, the attacker will be able to attack systems on your LAN only through additional reverse-channel attacks that somehow exploit user-initiated outbound connections, because Firewall 1 allows no DMZ-originated, inbound transactions. It allows only LAN-originated, outbound transactions.
In contrast, if the Web proxy resides on your LAN, the attacker needs to get lucky with a reverse-channel attack only once and can scan for and execute more conventional attacks against your internal systems. For this reason, I think Web proxies are ideally situated in DMZ networks, although I acknowledge that the probability of a well-configured, well-patched Squid server being compromised via firewall-restricted Web transactions is probably low.