Back from the Dead: Simple Bash for Complex DDoS
If you work for a company with an online presence long enough, you'll deal with it eventually. Someone, out of malice, boredom, pathology, or some combination of all three, will target your company's online presence and resources for attack. If you are lucky, it will be a run of the mill Denial of Service (DoS) attack from a single or limited range of IP addresses that can be easily blocked at your outermost point, and the responsible parties will lack the necessary expertise to overcome this relatively simple countermeasure. Your usual script kiddie attack against a site with competent network and server administration is fairly short. If you are unlucky, you'll experience something worse: A small percentage of attacks is from a higher caliber of black hat, and while more difficult to deal with, the individual generally bores easily and moves on.
If you are very, very unlucky, someone highly skilled and just as determined will decide to have some fun with you. If this person decides they want to crack their way into your servers and explore your environment, eventually they will get in and there isn't too much you can do about it. As long as they don't do anything too obvious, like launch a huge dictionary crack attack against other sites from your servers, you may never know, even if you are pretty good and attentive. And if they decide they want to knock you off of the Internet, then down you go.
I had the misfortune to be on the receiving end of such an attack at a previous employer who shall remain nameless (but it was in 2007 and my LinkedIn is public: http://www.linkedin.com/in/gregbledsoe). Someone didn't seem to like us very much and decided to erase us from online existence. At first it was a standard DoS SYN flood that any script kiddie could launch, a minor annoyance at best, easily mitigated by blocking the source IP at the point of ingress. Then it got interesting.
The attacker adapted by engaging a substantial bot-net and it became a distributed denial of service (DDoS) attack. The targeted server address was down briefly until we engaged our carriers to block the inbound attack further out. Still, at that point, the crisis is over, right? Normally, yes. In this case? Not even close.
The attacker adapted the attack *again*, this time seeming to rotate through connections from real bot-net systems while also sending oodles of fake connection requests from random spoofed IP addresses. All told, the number of incoming connection requests was close to a million at a time. This took us down hard. Panic ensued, and after some quick brainstorming a number of mitigation techniques were attempted, all to no avail. The connections went through our firewall, through our load balancer, and hit one of three back-end systems, all of which were overwhelmed by the load imposed by the attack. We tried rate limiting on the firewall, and while I'm not sure exactly what was implemented, it took down everything behind the firewall, not just the targeted URL/server address. The rate-limiting statements were taken back out of the configuration, but everything stayed down. We discovered that the firewall equipment was out of memory from creating table space to keep track of all the connection attempts. It couldn't tell the difference between spoofed, real, and legitimate TCP SYN connection requests, so it tracked them all and let them through. Apparently the particular equipment we had did not allow more granular rate limiting. Options were discussed, including rejiggering our DNS to send all our traffic through a (very expensive) company that promised to scrub the attack before it reached us. I was skeptical of this idea.
Being the Unix Guy, my domain was the back-end servers and, to a lesser extent, the load balancer. After watching the output of netstat, lsof -ni, and tcpdump for a while, I knew how to defeat this attack. I spent about 10 minutes crafting my countermeasure and deployed it on all three back-end servers, and within seconds our environment was alive again. The red of Nagios alarms cleared within a few minutes and our phones stopped ringing. Our total downtime was about an hour.
What I noticed that made this countermeasure work was a clear threshold between the number of connections opened by legitimate users and the high number of connections from both the real and spoofed IPs that were part of the attack. By identifying them on the back-end servers and sending TCP resets (with the RST flag on) back on all those connection requests over the threshold, we could clear out the connection information on the server, the load balancer, and the firewall, and free up the memory that had been used to store that entry in the table. Clear out enough of them quickly enough, faster than new attack IPs were coming in, and life became good again.
Here is the (very simple) script I ran on all three servers.
#!/bin/bash
while [ 1 ] ;
do
  # Get the list of all connections and connection attempts to httpd,
  # produce a list of unique remote IPs, and iterate through the list.
  for ip in `lsof -ni | grep httpd | grep -iv listen | awk '{print $8}' | cut -d : -f 2 | sort | uniq | sed s/"http->"//` ;
  do
    # Find how many connections there are from this particular IP address.
    noconns=`lsof -ni | grep $ip | wc -l`;
    echo $ip : $noconns ;
    # If there are more than 10 connections established or connecting from this IP
    if [ "$noconns" -gt "10" ] ;
    then
      # echo More;
      # echo `date` "$ip has $noconns connections. Total connections to prod spider: `lsof -ni | grep httpd | grep -iv listen | wc -l`" >> /var/log/Ddos/Ddos.log
      # To keep track of the IPs, uncomment the above two lines and
      # make sure you can write to the appropriate place.
      # For these connections, add an iptables statement to send
      # resets on any packets received.
      iptables -I INPUT -s $ip -p tcp -j REJECT --reject-with tcp-reset
    else
      : # echo Less;
    fi;
  done
  sleep 60
done
Our attacker made a number of attempts to adapt to this solution, trying for instance to have sections of the bot-nets start at some IP, like 1.1.1.1, and send one connection apiece, rotating through IPs as quickly as possible to avoid tripping the threshold, but couldn't rotate quickly enough to wreak the same level of havoc as before. This script proved very robust against the rest of his attacks. Some fine-tuning was done, for instance to remove lines after they aged a particular amount, but the essence of the script remained the same.
What I really liked about this solution was the simplicity. I have found that the best solutions are usually the simplest. If you really understand the underlying technology and protocols, then you can often see right through to what underlies a problem, and avoid adding layer after layer of expense and complexity (and corresponding break points) to your environment.
I'm more than willing to release this under the GPL v2. If anyone is interested in incorporating this snippet or its concepts into a larger solution for distribution, let me know via the email address below.
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
Comments
: instead of [ 1 ]
You could use
while :;
instead of
while [ 1 ];
I really believe the
I really believe the important lesson here is that rejecting packets instead of dropping them can help the surrounding network get a hint of what's going on and mitigate the situation, even though dropping the packets may superficially seem more effective (because DROP does not create any more traffic on an already heavily burdened network, whereas REJECT does).
However, one must pay attention to very large DDoS attacks, where this simple method can fill up the iptables table with lots and lots of rules, adversely affecting CPU and memory usage to the point where the mitigation itself becomes an autoimmune disease. I have yet to see such a case on physical servers, whereas OpenVZ containers can easily die because of this, courtesy of the numiptent limit that UBC-based containers are subject to.
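One way to limit the rule growth described above, offered as a sketch and not something the article's author reports using, is to keep offending addresses in a single ipset set that one iptables rule matches. The set name `ddos_block`, the one-hour timeout, and the `DRY_RUN` switch are illustrative assumptions.

```shell
#!/bin/bash
# Sketch only: the set name, timeout, and DRY_RUN switch are assumptions,
# not part of the original script.
SET_NAME="ddos_block"
TIMEOUT=3600   # seconds before an entry expires on its own

# Print the command when DRY_RUN=1, otherwise execute it (needs root).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# One-time setup: a hash:ip set plus a single REJECT rule that matches it.
# Call this once; calling it again would insert a duplicate iptables rule.
setup_block_set() {
  run ipset create "$SET_NAME" hash:ip timeout "$TIMEOUT" -exist
  run iptables -I INPUT -p tcp -m set --match-set "$SET_NAME" src \
      -j REJECT --reject-with tcp-reset
}

# Add one address to the set; -exist makes re-adding harmless.
block_ip() {
  run ipset add "$SET_NAME" "$1" -exist
}
```

In the article's loop, `block_ip $ip` would take the place of the per-address `iptables -I` line. Because entries expire after the timeout, the rule count stays at one and no separate cleanup job is needed.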
Another simple script
Very good idea.
Here is something similar that uses netstat to count connections, has configurable white list, and configurable number of connections.
http://deflate.medialayer.com/
Good ideas
Wow. Remarkably similar. Any idea when this was written? (I really want mine to have been first! Come on 2008 or later!)
Reality is most good ideas occur to more than one person at different times... like the debate on who invented the lightbulb first, or calculus...
Bottom line is it works, simple, effective, fairly light-weight. I'll have to take a closer look at deflate, see if I can contribute at all... looks pretty complete though.
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
Different approaches to the same problem
Greg, I can see advantages to your approach also. Sometimes you would not want to go through the setup and the extra complexity when in an emergency situation. It's good and varied tools that help us be efficient at our jobs.
The only real difference
The only real difference is that I switched to REJECT with --reject-with tcp-reset while he uses DROP. DROP leaves the record and memory usage in place on stateful network gear for the connections. I would say that is a slight advantage to my solution in some circumstances, but it is easy to change in ddos.sh.
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
He did it earlier... :)
This is reported in the ddos.sh file:
echo "DDoS-Deflate version 0.6"
echo "Copyright (C) 2005, Zaf "
Ciao.
Well darn
I didn't see that in my first cursory look over it, since it isn't at the top of the file.
Then I guess the kudos are yours, Zaf. You did it first. Darn you. :-D
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
Very Clever, Understanding What Endpoints are being attacked
Hi Greg:
I must say it's rare to find the correct solution to defend against Web DdoS Attacks.
Your Solution is Simple, Effective and Very Clever, KUDO's !!!
To defend against Web DdoS in a Panic/Crisis Mode, Most Folks ultimately get their ISP's involved upstream. ISP's can/will cause blockage to legitimate Business Services, causing hundreds of help desk phone calls, exactly what the Attacker wanted to accomplish.
Preventing the Ddos Attack at the correct Endpoint, Web/Application Servers exposed on Public Internet, is by far the best solution to issue.
Can You Please Share Your Nightly Cron Job that You ran to remove stale Reject Rules ?
I am a Linux/Bash Newbie learning more things everyday. Does Your Bash Script run every 60 seconds continuously, until Servers are rebooted ?
Regards,
Bruce C
It *was* 2007
I didn't save that cron job -- but it really shouldn't be too terribly difficult to replicate. If I can squeak some time out of my day I'll take a stab at it.
I would run the script with a "nohup [command] &" which would only stop if killed specifically by pid or name, or with a reboot. Reboot seems like overkill though. "kill [pid]" should do it.
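The original cleanup job was not saved, so the following is only a rough sketch of what such a job might look like, not a reconstruction of it. It assumes the column layout printed by `iptables -L INPUT -n -v -x` and treats a packet counter of zero as "not hit since the last run"; the function names are hypothetical.

```shell
#!/bin/bash
# Sketch only: assumes the column layout printed by `iptables -L INPUT -n -v -x`
# (pkts, bytes, target, prot, opt, in, out, source, destination, options).

# Read an `iptables -L INPUT -n -v -x` listing on stdin and print the source
# address of every tcp-reset REJECT rule whose packet counter is still zero.
stale_sources() {
  awk 'NR > 2 && $3 == "REJECT" && $1 == 0 && /reject-with tcp-reset/ { print $8 }'
}

# Delete those rules by specification (not by rule number, which shifts
# whenever the blocking script inserts a new rule), then zero the counters
# of the whole INPUT chain so the next run measures a fresh window.
# Needs root.
prune_stale_rules() {
  iptables -L INPUT -n -v -x | stale_sources |
  while read -r ip; do
    iptables -D INPUT -s "$ip" -p tcp -j REJECT --reject-with tcp-reset
  done
  iptables -Z INPUT
}
```

Calling `prune_stale_rules` from root's crontab once a night would approximate the behavior described. A rule added shortly before the run, with no hits yet, would be removed too; the blocking loop would simply add it again if that address were still over the threshold.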
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
May be a filter in application program can do the job
I have written a filter for Tomcat that counts parallel connections from the same IP. If the counter reaches a threshold, it shuts down the connection with "shutdown(fd, SHUT_WR)", so that the server will send back a RST. I also took samples of memory usage; if a request is pending and there is not enough memory, it drops the request.
Interesting.
I'd like to see that, too! That could certainly work in certain circumstances.
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
Absolutely beautiful
This is the kind of solution I love to use. I have learned to love bash, it saves a lot of time and money. This script is definitely going to my toolbox.
Thanks!
And, I concur wholeheartedly!
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
2 lines
Wouldn't something like :
iptables -I INPUT -p tcp --dport 80 -i eth0 -m state --state NEW -m recent --set
iptables -I INPUT -p tcp --dport 80 -i eth0 -m state --state NEW -m recent --update --seconds 60 --hitcount 10 -j DROP
do the job without the actual script?
-
Alex
Something else occurred to me about this
These iptables statements will let every IP continue to send *up to* 10 connection requests per minute. That wouldn't really have helped us with the number of IPs being used in the attack - we needed to identify and then reject and clear *all* connection attempts from the "bad" IPs. I looked over the current man page for IP tables and don't see a way to do that without some scripting.
But I appreciate you provoking me to look! Always learning. :-)
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
Almost
But not quite. I've had quite a few problems getting --update and --hitcount to work together correctly, or more properly, as I expect them to. It's entirely possible that the issues I encountered are no longer relevant, but I've not tested it recently. Second, what your iptables lines will catch is connection requests in 60 seconds; what the script catches is simultaneous connections outstanding. It's a slight difference but a meaningful one, and could, in the right circumstances, make all the difference.
As an aside - DROP isn't what you want in this case. DROP leaves the tracking burden on all the stateful gear between you and the endpoint - which doesn't fix the problem.
Good suggestion though!
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
Good
Great minds think alike.
Good way to learn about bash and networking
Hi Greg,
Thanks for the article, I plan to work through the script as an exercise to improve my knowledge of bash and networking, particularly the lsof command.
To add my 2 cents, similar to Pablo, if you ever need the script again I believe that it is more efficient to use grep -c rather than piping to wc -l (I think I read that in another LJ article??). Probably a negligible improvement but hey why not? :)
Thanks again,
Ewen
Not a bad point either
It's entirely possible grep is faster at counting lines. This isn't something I've tested personally, though it seems (uneducated guess alert) that grep is optimized for searching while wc is optimized for counting. I'd suspect wc is more resource efficient, though I could very well be wrong.
Now I will be irresistibly drawn to test it and unable to sleep until I do. Thanks! ;-P
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
Sorry, make myself clearer
Sorry, I meant not using grep -c as a drop in replacement for wc -l but being used in the example of:
noconns=`lsof -ni | grep $ip | wc -l`;
replace with:
noconns=`lsof -ni | grep -c $ip`;
As grep has already done all of the searching work anyway.
Cheers,
Ewen
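A small sketch of this suggestion, going one step beyond it: since `grep $ip` treats dots as wildcards and matches substrings, a count for 1.2.3.4 would also include lines for 11.2.3.4 or 1.2.3.45. The helper name `count_ip` is hypothetical, and the sketch assumes a grep that supports `-F` and `-w` together, as GNU grep does.

```shell
#!/bin/bash
# Sketch only: count_ip is an illustrative helper, not part of the article.

# Read lsof-style lines on stdin and print how many mention exactly this IP.
count_ip() {
  # -c: print the number of matching lines instead of the lines themselves
  # -F: treat the address literally (dots are not regex wildcards)
  # -w: require non-word characters around the match, so 1.2.3.4
  #     does not also count 11.2.3.4 or 1.2.3.45
  grep -cwF -- "$1"
}

# Hypothetical usage inside the article's loop:
#   noconns=`lsof -ni | count_ip "$ip"`
```

Note that `grep -c` prints 0 but returns a non-zero exit status when nothing matches, which matters only if the script runs under `set -e`.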
This might make an interesting article in itself
i.e., how to get the answer to a "which is faster or more efficient" question when it comes to bash scripting. Using time, I found that, as I suspected, wc is *much much much* faster than grep -c, but that excludes time for subshell spawning that would be involved in piping.
Generally, bash built-ins and one-shot single-purpose commands are way faster at what they do than the big commands that are swiss-army-style utilities like awk, sed, or grep. cut is faster than tr, and tr is faster than sed, and sed is faster than awk, etc. But adding in piping and associated overhead muddies the picture a little.
Maybe I *will* write that article. :-D
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
How did you test?
I found that using grep -c was faster than piping to wc. Maybe because of what you mentioned with piping overhead and what not. I agree that wc would be faster than grep, but if you have to use grep anyway to perform the search, may as well just use it to count?
I tried this:
mybigfile.txt is 881M and just created by cat'ing /usr/share/dict files together a bunch of times.
$ time grep -c a mybigfile.txt
50224464
real 0m8.251s
user 0m8.110s
sys 0m0.120s
$ time grep a mybigfile.txt | wc -l
50224464
real 0m10.991s
user 0m11.610s
sys 0m0.320s
So basically what it comes down to is yes that would be an interesting article and I'd like to read it :)
Intriguing
I wish I had further time to do more testing. I'd be interested to see if versions made a difference, and the complexity of the grep. :-D
I just put it on my list.
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
I hear ya
I still need to test though. :-D I suspect it'll be a close call between wc keeping a cumulative count vs grep tracking it on the way through. :-D
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
Very nice script, thanks for sharing but...
...I have a doubt. The rules you added with repeated commands like:
iptables -I INPUT -s $ip -p tcp -j REJECT --reject-with tcp-reset
reset all the connections coming from a pool of IPs you have previously selected as potential attacker IPs (spoofed and non-spoofed).
How many IPs are we talking about? You should have some numbers (/var/log/Ddos/Ddos.log).
If there are a lot of these IPs (especially the spoofed ones), using the iptables rule you are potentially also blocking common users who, after the DDoS is over, are trying to hit your web servers...
Do you delete the iptables rules after a while?
Just asking here because I am very interested in fully understanding your bash script.
Cheers,
ztank
Very good point
That is an excellent question. Thanks for asking! In fact, I ran a nightly cron job that removed reject rules that hadn't been hit in a certain amount of time. We tuned that out of exactly the concern you raise, blocking actual users, but eventually proclaimed it "good enough" when we only had one complaint over several days from an actual user that couldn't reach us, which we tracked back to an iptables rule.
Again, great question!
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
DDOS
We had a DDOS against an asterisk box (it was directly on the net, because we could not convince managers to do it differently).
We used fail2ban and it could not keep up.
We then started blocking entire ranges of the internet, leaving the US IP ranges in the end and letting F2B handle that.
Your script might have worked for us. F2B waits for failed authentication and you are looking at active connections.
Dropping script in my gmail tools folder. Nice idea.
Thanks,
Dave
Great!
Feel free to use it if you need to, Dave. I would appreciate it if you fed back any improvements though. :-)
F2B is really for a different kind of problem, more of a crack attempt kind of attack. I've also had bad luck trying to block whole geographical regions, as ISPs have a way of shifting blocks around unpredictably as IPv4 space availability tightens.
Glad to put a new tool in the toolbox!
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
A collection of 50 or so of
A collection of 50 or so of these scenarios, with do-it-yourself replicable code, might make for a very readable book on the subject.
Any book recommendations for how I can get up to understanding all this in the meantime?
~3
Now there's an idea.
This particular script requires a pretty solid understanding of both basic networking and basic bash scripting. I would suggest starting with some resources designed to get someone up to the CCNA level or equivalent (not necessarily Cisco focused) and some bash scripting tutorials, going from:
http://www.freeos.com/guides/lsst/
to:
http://tldp.org/LDP/abs/html/
That should keep you busy for a while!
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king
Great idea that shows the value of having experienced people
Hi, Greg:
That's a great idea and it shows the value of having experienced people on board when you run into problems.
Regarding the script, you could probably use 'uniq -c' instead of 'uniq' to make the count at the same time as the list and avoid having to run lsof so many times.
Kudos to your scripting and networking abilities,
Pablo
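The `uniq -c` suggestion above could be sketched roughly as follows. The function name `over_threshold` is hypothetical. Unlike the article's script, which counts every `lsof -ni` line matching the address, this version counts only the addresses fed to it, so the numbers may differ slightly.

```shell
#!/bin/bash
# Sketch only: over_threshold is an illustrative helper, not part of the article.

# Read one remote IP per line on stdin and print "ip count" for every IP
# that appears more than the given number of times (default 10).
over_threshold() {
  local threshold="${1:-10}"
  # sort groups identical addresses, uniq -c prefixes each with its count,
  # and awk keeps only the addresses whose count exceeds the threshold.
  sort | uniq -c | awk -v max="$threshold" '$1 > max { print $2, $1 }'
}

# Hypothetical usage with the article's own pipeline, running lsof only once:
#   lsof -ni | grep httpd | grep -iv listen | awk '{print $8}' \
#     | cut -d : -f 2 | sed 's/http->//' | over_threshold 10
```

Each output line could then be read with `while read ip noconns` and passed to the same `iptables -I INPUT ... --reject-with tcp-reset` statement the article uses.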
That would certainly have been more efficient!
But under the gun, I just didn't think of it. :-) Thanks for the suggestion! If I ever need to use this again (may it never be!) I'll include your suggestion!
--
I was cloud before cloud was cool. Not in the sense of being an amorphous collection of loosely related molecules with indeterminate borders -- or maybe I am. Holla @geek_king, http://twitter.com/geek_king