Own Your DNS Data

I honestly think most people simply are unaware of how much personal data they leak on a daily basis as they use their computers. Even if they have some inkling along those lines, I still imagine many think of the data they leak only in terms of individual facts, such as their name or where they ate lunch. What many people don't realize is how revealing all of those individual, innocent facts are when they are combined, filtered and analyzed.

Cell-phone metadata (who you called, who called you, the length of the call and what time the call happened) falls under this category, as do all of the search queries you enter on the Internet.

For this article, I discuss a common but often overlooked source of data that is far too revealing: your DNS data. You see, although you may give an awful lot of personal marketing data to Google with every search query you type, that still doesn't capture all of the sites you visit outside Google searches either directly, via RSS readers or via links your friends send you. That's why the implementation of Google's free DNS service on 8.8.8.8 and 8.8.4.4 is so genius—search queries are revealing, but when you capture all of someone's DNS traffic, you get the complete picture of every site they visit on the Internet and beyond that, even every non-Web service (e-mail, FTP, P2P traffic and VoIP), provided that the service uses hostnames instead of IP addresses.

Let me back up a bit. DNS is one of the core services that runs on the Internet, and its job is to convert a hostname, like www.linuxjournal.com, into an IP address, such as 76.74.252.198. Without DNS, the Internet as we know it today would cease to function, because basically every site we visit in a Web browser, and indeed, just about every service we use on the Internet, we get to via its hostname and not its IP. That said, the only way we actually can reach a host on the Internet is via its IP address, so when you decide to visit a site, its hostname is converted into an IP address to which your browser then opens up a connection. Note that via DNS caching and TTL (Time To Live) settings, you may not have to send out a DNS query every time you visit a site. All the same, these days TTLs are short enough (often ranging between one minute to an hour or two—www.linuxjournal.com's TTL is 30 minutes) that if I captured all your DNS traffic for a day, I'd be able to tell you every Web site you visited along with the first time that day you visited it. If the TTL is short enough, I probably could tell you every time you went there.

Most people tend to use whatever DNS servers they have been provided. On a corporate network, you are likely to get a set of DNS servers over DHCP when you connect to the network. This is important because many corporate networks have internal resources and internal hostnames that you would be able to resolve only if you talked to an internal name server.

______________________

Kyle Rankin is VP of engineering operations at Final, Inc., the author of many books including Linux Hardening in Hostile Networks, DevOps Troubleshooting and The Official Ubuntu Server Book, and a columnist for Linux Journal. Follow him @kylerankin