Paranoid Penguin - DNS Cache Poisoning, Part I

Understand and defend against DNS cache poisoning.
DNS Cache Poisoning

Two things should be fairly obvious to you by now. First, DNS is an essential Internet infrastructure service that must work correctly in order for users to reach the systems with which they wish to interact. Second, even a simple DNS query for a single IP address can result in multiple network transactions, any one of which might be tampered with.

Relying, as it does, on the “stateless” UDP protocol for most queries and replies, DNS transactions are inherently prone to tampering, packet-injection and spoofing. Tampering with the reply to a DNS query, on a local level, is as simple as sending spoofed packets to the “target” system making the query and hoping they arrive before the query's “real” answer does.

Spoofing a DNS reply being sent from a recursing DNS server to a client system impacts only that one client system's users. What if you could instead tamper with the recursive nameserver's queries, injecting false data into its cache and, thus, affecting the DNS queries of all computers that use that DNS server?

And, what if, instead of tampering strictly with individual A records describing the IPs of individual hosts, you could inject fraudulent NS records that redirect DNS queries to your (fraudulent) nameserver, potentially impacting an entire name domain?

When security researcher Dan Kaminsky discovered fundamental flaws in the DNS protocol in 2008, these were the very attack scenarios he identified. Before you get too panicky, I'm going to give a little spoiler, and say that even in 2008, before he gave his now-renowned Black Hat presentation on these attacks, Kaminsky worked with DNS server software vendors, such as ISC and Microsoft, to release urgent patches that at least partially mitigated this risk before Kaminsky's attack became widely known.

But, the attack has been only partially mitigated by patching. Because this is such an important, widespread and interesting issue, let's explore Kaminsky's DNS cache poisoning attack in depth.

All the transactions comprising the DNS query in Figure 1 use UDP, which I've said is easily spoofed. So, what's to prevent an attacker from sending fraudulent replies to any one of those transactions?

Before 2008, the answer to this question was twofold: Query IDs and bailiwick checking. Every DNS query packet contains a Query ID, a 16-bit number that must be included in any reply to that query. At the very least, Query IDs help a recursive DNS server that may have numerous, concurrent queries pending at any given time to correlate replies to the proper queries as they arrive, but the Query ID also is supposed to make it harder to spoof DNS replies.

Bailiwick is, here, a synonym for “relevance”. Any glue records included in a DNS reply must be relevant to the corresponding query. Therefore, if an attacker attempts to poison a recursing DNS server's cache via a “Kashpureff attack” (see the Cricket Liu interview in the Resources section) in which extraneous information is sent via glue records to a recursing DNS server that has been tricked into making a query against a hostile nameserver, the attack will succeed only if the recursing nameserver fails to perform bailiwick checking that correlates those glue records to the query.

For example, if I can get a recursing DNS server to look up the name, and I control the name domain, I could send a reply that includes not only the requested IP, but “extra” A records that point, and other sites whose traffic I wish to intercept using impostor servers.

Those fake A records will replace any records for those hosts already cached by the target recursing nameserver. However, bailiwick checking has been a standard, default feature for practically all DNS server software since 1997, so the Kashpureff attack is largely obsolete (insofar as any historical TCP/IP attack ever is).

So to review, Query IDs are supposed to prevent reply spoofing, and bailiwick checking is supposed to prevent weirdness with glue records.

Yet, Kaminsky discovered that despite Query IDs and bailiwick checking, it nonetheless was possible both to spoof DNS replies and abuse glue records and, thus, to poison the caches of most recursing nameservers successfully. Here's how Kaminsky's attack works.

The object of this attack is to poison a recursing DNS nameserver's cache with fraudulent A records (for individual hosts) or even fraudulent NS records (for entire domains). In the example I'm about to use, the objective will be to inject a fraudulent A record for the host

This will be achieved by either initiating, or tricking some other host served by the recursing nameserver into initiating, a flood of queries against random, presumably nonexistent hostnames in the same name domain as that of the host whose name we wish to hijack. Figure 2 shows an attacker sending a flood of queries for hostnames, such as, and so forth.

Figure 2. Kaminsky's Cache Poisoning Attack

Besides the fact that it's convenient to generate a lot of them, querying randomized/nonexistent hostnames increases the odds that the answers aren't already cached. Obviously, if you send a query for some host whose IP already is in the recursing nameserver's cache, that nameserver will send you the IP in question without making any recursive queries. Without recursive queries, there are no nameserver replies to spoof!

Almost concurrently with sending the queries, the attacker unleashes a flood of spoofed replies purporting to originate from that name domain's authoritative nameserver (in Figure 2, There are several notable things about these replies.

First, also as shown in Figure 2, they do not provide answers to the attacker's queries, which as you know concern nonexistent hosts anyhow. Rather, they refer the recursing nameserver to another “nameserver”,, conveniently offering its IP address as well (which, of course, is actually the IP address of an attacker-controlled system).

The whole point of these queries is to provide an opportunity to send glue records that pass bailiwick checking but are nonetheless fraudulent. If you're trying to hijack DNS for an entire domain, in which case you'd spoof replies to queries against a Top-Level Domain authority, such as for .com, you'd send glue records pointing to a hostile DNS server that could, for example, send fraudulent (attacker-controlled) IPs for popular on-line banking and e-commerce sites, and simply recurse everything else.

In the example here, however, the attacker instead is using the pretense of referring to a different nameserver, in order to plant a fake Web server's IP address into the target recursing nameserver's cache. The fact that this fake Web server doesn't even respond to DNS queries doesn't matter; the attacker wants on-line banking traffic to go there.

The second notable thing about the attacker's spoofed replies (and this is not shown in Figure 2), is that each contains a different, random Query ID. The reason for sending a flood of queries and a flood of replies is to maximize the chance that one of these reply's Query IDs will match that of one of the corresponding recursed queries that the targeted recursing nameserver has initiated to

And, this is arguably the most important aspect of Kaminsky's attack. By simultaneously making multiple guesses at the Query IDs of multiple queries, the attack takes advantage of the “birthday problem” to improve the chances of matching a spoofed reply to a real query. I'll resist the temptation to describe the birthday problem here (see Resources), but suffice it to say, it's a statistical principle that states that for any potentially shared characteristic, the odds of two or more subjects sharing that characteristic increases significantly by increasing the population of subjects even slightly.

Thus, even though the odds are 65,534 to 1 against an attacker guessing the correct Query ID of a single DNS query, these odds become exponentially more favorable if the attacker attempts multiple queries, each with multiple fake replies. In fact, using a scripted attack, Kaminsky reported success in as little as ten seconds!

Yet another thing not shown in Figure 2 is the TTL for the fraudulent glue A records in the attacker's spoofed replies. The attacker will set this TTL very high, so that if the attack succeeds, the victim nameserver will keep the fraudulent A record in its cache for as long as possible.

The last thing to note about this attack is that it will fail if none of the spoofed replies matches a query, before manages to get its real reply back to the recursing nameserver. Here again, initiating lots of simultaneous queries increases the odds of winning at least one race with the real nameserver, with a reply containing a valid Query ID.