Advanced Firewall Configurations with ipset
Each set is of a specific type, which defines what kind of values can be stored in it (IP addresses, networks, ports and so on) as well as how packets are matched (that is, what part of the packet should be checked and how it's compared to the set). Besides the most common set types, which check the IP address, additional set types are available that check the port, the IP address and port together, MAC address and IP address together and so on.
Each set type has its own rules for the type, range and distribution of values it can contain. Different set types also use different types of indexes and are optimized for different scenarios. The best/most efficient set type depends on the situation.
The most flexible set types are iphash, which stores lists of arbitrary IP addresses, and nethash, which stores lists of arbitrary networks (IP/mask) of varied sizes. Refer to the ipset man page for a listing and description of all the set types (there are 11 in total at the time of this writing).
The special set type setlist also is available, which allows grouping several sets together into one. This is required if you want to have a single set that contains both single IP addresses and networks, for example.
Advantages of ipset
Besides the performance gains, ipset also allows for more straightforward configurations in many scenarios.
If you want to define a firewall condition that would match everything but packets from 188.8.131.52 or 184.108.40.206 and continue processing in mychain, notice that the following does not work:
iptables -A INPUT -s ! 220.127.116.11 -g mychain iptables -A INPUT -s ! 18.104.22.168 -g mychain
If a packet came in from 22.214.171.124, it would not match the first rule (because the source address is 126.96.36.199), but it would match the second rule (because the source address is not 188.8.131.52). If a packet came in from 184.108.40.206, it would match the first rule (because the source address is not 220.127.116.11). The rules cancel each other out—all packets will match, including 18.104.22.168 and 22.214.171.124.
Although there are other ways to construct the rules properly and achieve the desired result without ipset, none are as intuitive or straightforward:
ipset -N myset iphash ipset -A myset 126.96.36.199 ipset -A myset 188.8.131.52 iptables -A INPUT -m set ! --set myset src -g mychain
In the above, if a packet came in from 184.108.40.206, it would not match the rule (because the source address 220.127.116.11 does match the set myset). If a packet came in from 18.104.22.168, it would not match the rule (because the source address 22.214.171.124 does match the set myset).
Although this is a simplistic example, it illustrates the fundamental benefit associated with fitting a complete condition in a single rule. In many ways, separate iptables rules are autonomous from each other, and it's not always straightforward, intuitive or optimal to get separate rules to coalesce into a single logical condition, especially when it involves mixing normal and inverted tests. ipset just makes life easier in these situations.
Another benefit of ipset is that sets can be manipulated independently of active iptables rules. Adding/changing/removing entries is a trivial matter because the information is simple and order is irrelevant. Editing a flat list doesn't require a whole lot of thought. In iptables, on the other hand, besides the fact that each rule is a significantly more complex object, the order of rules is of fundamental importance, so in-place rule modifications are much heavier and potentially error-prone operations.
Excluding WAN, VPN and Other Routed Networks from the NAT—the Right Way
Outbound NAT (SNAT or IP masquerade) allows hosts within a private LAN to access the Internet. An appropriate iptables NAT rule matches Internet-bound packets originating from the private LAN and replaces the source address with the address of the gateway itself (making the gateway appear to be the source host and hiding the private "real" hosts behind it).
NAT automatically tracks the active connections so it can forward return packets back to the correct internal host (by changing the destination from the address of the gateway back to the address of the original internal host).
Here is an example of a simple outbound NAT rule that does this, where 10.0.0.0/24 is the internal LAN:
iptables -t nat -A POSTROUTING \ -s 10.0.0.0/24 -j MASQUERADE
This rule matches all packets coming from the internal LAN and masquerades them (that is, it applies "NAT" processing). This might be sufficient if the only route is to the Internet, where all through traffic is Internet traffic. If, however, there are routes to other private networks, such as with VPN or physical WAN links, you probably don't want that traffic masqueraded.
One simple way (partially) to overcome this limitation is to base the NAT rule on physical interfaces instead of network numbers (this is one of the most popular NAT rules given in on-line examples and tutorials):
iptables -t nat -A POSTROUTING \ -o eth0 -j MASQUERADE
This rule assumes that eth0 is the external interface and matches all packets that leave on it. Unlike the previous rule, packets bound for other networks that route out through different interfaces won't match this rule (like with OpenVPN links).
Although many network connections may route through separate interfaces, it is not safe to assume that all will. A good example is KAME-based IPsec VPN connections (such as Openswan) that don't use virtual interfaces like other user-space VPNs (such as OpenVPN).
Another situation where the above interface match technique wouldn't work is if the outward facing ("external") interface is connected to an intermediate network with routes to other private networks in addition to a route to the Internet. It is entirely plausible for there to be routes to private networks that are several hops away and on the same path as the route to the Internet.
Designing firewall rules that rely on matching of physical interfaces can place artificial limits and dependencies on network topology, which makes a strong case for it to be avoided if it's not actually necessary.
As it turns out, this is another great application for ipset. Let's say that besides acting as the Internet gateway for the local private LAN (10.0.0.0/24), your box routes directly to four other private networks (10.30.30.0/24, 10.40.40.0/24, 192.168.4.0/23 and 172.22.0.0/22). Run the following commands:
ipset -N routed_nets nethash ipset -A routed_nets 10.30.30.0/24 ipset -A routed_nets 10.40.40.0/24 ipset -A routed_nets 192.168.4.0/23 ipset -A routed_nets 172.22.0.0/22 iptables -t nat -A POSTROUTING \ -s 10.0.0.0/24 \ -m set ! --set routed_nets dst \ -j MASQUERADE
As you can see, ipset makes it easy to zero in on exactly what you want matched and what you don't. This rule would masquerade all traffic passing through the box from your internal LAN (10.0.0.0/24) except those packets bound for any of the networks in your routed_nets set, preserving normal direct IP routing to those networks. Because this configuration is based purely on network addresses, you don't have to worry about the types of connections in place (type of VPNs, number of hops and so on), nor do you have to worry about physical interfaces and topologies.
This is how it should be. Because this is a pure layer-3 (network layer) implementation, the underlying classifications required to achieve it should be pure layer-3 as well.
Limiting Certain PCs to Have Access Only to Certain Public Hosts
Let's say the boss is concerned about certain employees playing on the Internet instead of working and asks you to limit their PCs' access to a specific set of sites they need to be able to get to for their work, but he doesn't want this to affect all PCs (such as his).
To limit three PCs (10.0.0.5, 10.0.0.6 and 10.0.0.7) to have outside access only to worksite1.com, worksite2.com and worksite3.com, run the following commands:
ipset -N limited_hosts iphash ipset -A limited_hosts 10.0.0.5 ipset -A limited_hosts 10.0.0.6 ipset -A limited_hosts 10.0.0.7 ipset -N allowed_sites iphash ipset -A allowed_sites worksite1.com ipset -A allowed_sites worksite2.com ipset -A allowed_sites worksite3.com iptables -I FORWARD \ -m set --set limited_hosts src \ -m set ! --set allowed_sites dst \ -j DROP
This example matches against two sets in a single rule. If the source matches limited_hosts and the destination does not match allowed_sites, the packet is dropped (because limited_hosts are allowed to communicate only with allowed_sites).
Note that because this rule is in the FORWARD chain, it won't affect communication to and from the firewall itself, nor will it affect internal traffic (because that traffic wouldn't even involve the firewall).
|Speed Up Your Web Site with Varnish||Jun 19, 2013|
|Non-Linux FOSS: libnotify, OS X Style||Jun 18, 2013|
|Containers—Not Virtual Machines—Are the Future Cloud||Jun 17, 2013|
|Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer||Jun 12, 2013|
|Weechat, Irssi's Little Brother||Jun 11, 2013|
|One Tail Just Isn't Enough||Jun 07, 2013|
- Containers—Not Virtual Machines—Are the Future Cloud
- Non-Linux FOSS: libnotify, OS X Style
- Linux Systems Administrator
- Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer
- Validate an E-Mail Address with PHP, the Right Way
- Technical Support Rep
- Senior Perl Developer
- UX Designer
- Introduction to MapReduce with Hadoop on Linux
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?