Paranoid Penguin - DNS Cache Poisoning, Part I
Few recent Internet threats have made such a big impact as security researcher Dan Kaminsky's discovery, in 2008, of fundamental flaws in the Domain Name System (DNS) protocol that can be used by attackers to redirect or even hijack many types of Internet transactions. The immediate response by DNS software providers was to release software patches that make the problematic “DNS cache poisoning” attacks more difficult to carry out, and this certainly helped.
But, the best fix is to use DNSSEC, a secure version of the DNS protocol that uses x.509 digital certificates validated through Public Key Infrastructure (PKI) to protect DNS data from spoofing. Slowly but surely, DNSSEC is being deployed across key swaths of Internet infrastructure.
What does DNS cache poisoning mean, and how does it affect you? How can you protect your users from attacks on your organization's nameserver? The next few months, I'm going to explore DNS cache poisoning and DNSSEC in depth, including how DNS queries are supposed to work, how they can be compromised, and how they can be protected both in general and specific terms.
I'm not going to attempt to cover all aspects of DNS server security, like in Chapter Six of my book Linux Server Security (see Resources). Armed with the next few months' columns, however, I hope you'll understand and be able to defend against cache poisoning, a particular but very nasty DNS vulnerability.
As seems to be the pattern with these multiple-installment extravaganzas, I'm going to start out at a general, less-hands-on level, and enter increasingly technical levels of detail as the series progresses. With that, let's talk about how DNS is supposed to work and how it can break.
The Domain Name System is both a protocol and an Internet infrastructure for associating user-friendly “names” (for example, www.linuxjournal.com) with networks and computers that are, in fact, known to each other and to network infrastructure devices by their Internet Protocol (IP) addresses (for example, 18.104.22.168).
Sounds simple enough, right? Perhaps it would be, if the Internet wasn't composed of thousands of different organizations, each needing to control and manage its own IP addresses and namespaces. Being such, the Internet's Domain Name System is a hierarchical but highly distributed network of “name authorities”—that is, DNS servers that are “authoritative” only for specific swaths of namespace.
Resolving a host or network/domain name to an IP address, therefore, is a matter of determining which name authority knows the answer to your particular question. And, as you'll see shortly, it's extremely important that you can trust the answer you ultimately receive. If you punch the name of your bank's on-line banking site into your Web browser, you don't want to be sent to a clever clone of online.mybank.com that behaves just like the real thing but with the “extra feature” of sending your login credentials to an organized crime syndicate; you want to be sent to the real online.mybank.com.
The security challenge in DNS lookups (also called queries) is, therefore, to ensure that an attacker can't tamper with or replace DNS data. Unfortunately, the DNS protocol was designed with no rigorous technical controls for preventing such attacks.
But, I'm getting ahead of myself! Let's dissect a DNS lookup to show what happens between the time you type that URL into your browser and the time the page begins to load.
Your Web browser doesn't actually interact with authoritative nameservers. it passes the question “what's the IP address of online.mybank.com?” to your computer's local “stub resolver”, a part of the operating system. Your operating system forwards the query to your local network's DNS server, whose IP address is usually stored, on UNIX and UNIX-like systems, in the file /etc/resolv.conf (although this often is just a copy of data stored in some other network configuration script or file or of configuration data received from a DHCP server).
That local nameserver, which in practice is run either by your organization's Information Technology department or by your Internet Service Provider, then does one of two things. If it already has resolved online.mybank.com reasonably recently, it sends your browser the query results from its “cache” of recently resolved names. If online.mybank.com isn't in its cache, it will perform a recursive query on your behalf.
Recursive queries generally take several steps, illustrated in Figure 1. In our example, the recursing DNS server first randomly selects the IP address of one of the Internet's “root” nameservers from a locally stored list (every DNS server has this list; it isn't very long and seldom changes). It asks that root nameserver for the IP address of online.mybank.com.
The root nameserver replies that it doesn't know, but it refers the recursing nameserver to an authoritative nameserver for the .com top-level domain (TLD)—in our example, the fictional host dotcom.au7h.com. The root nameserver also provides this host's IP address (22.214.171.124). These two records, the NS record referring dotcom.au7h.com as an authority for .com and the A record providing dotcom.au7h.com's IP address, are called glue records.
The recursing nameserver then asks dotcom.au7h.com if it knows the IP address for online.mybank.com. It too replies that it doesn't know, but it refers the recursing nameserver to another nameserver, ns.mybank.com, which is authoritative for the mybank.com domain. It also provides that host's IP address (126.96.36.199).
Finally, the recursing nameserver asks ns.mybank.com whether it knows the IP address for online.mybank.com. Yes, it does: ns.mybank.com replies with the requested IP address, and the recursing nameserver forwards that information back to the end user's stub resolver, which in turn provides the IP address to the end user's Web browser.
In this example, then, the simple query from your stub resolver results in three queries from your local recursing DNS server, representing queries against root, the .com TLD and, finally, the mybank.com name domain. The results from all three of these queries are cached by the local DNS server, obviating the need for your server to pester authoritative nameservers for .com and .mybank.com until those cache entries expire.
That expiration time is determined by each cached record's Time to Live (TTL) value, which is specified by whatever authoritative nameserver provides a given record. A records that map IPs to specific hosts tend to have relatively short TTLs, but NS records that specify authoritative nameservers for entire domains or TLDs tend to have longer TTLs.
I've described how DNS query recursion is supposed to work. How can it be broken?
- Handling the workloads of the Future
- Readers' Choice Awards 2014
- diff -u: What's New in Kernel Development
- How Can We Get Business to Care about Freedom, Openness and Interoperability?
- December 2014 Issue of Linux Journal: Readers' Choice
- Synchronize Your Life with ownCloud
- Non-Linux FOSS: Don't Type All Those Words!
- Days Between Dates?
- Computing without a Computer