Using an SMS Server to Provide a Robust Alerting Service for Nagios
I initially had multiple overlapping “Directory” statements in the Nagios section of the Apache configuration file. The net effect was a “Permission denied” when the CGI was being run. I figured this out by using the method described below and by looking at the Apache access_log and error_log files.
If you think there is some communication problem with the script, you can monitor the traffic between Nagios, Apache and the iSMS by listening on the network. I used tcpdump to capture the HTTP traffic and see error messages:
% tcpdump -v -s 0 -w /tmp/cap host 192.168.1.50
In this example, I used the -v option for verbose output, the -s 0 option to capture as much of the packet as possible and the -w option to write the captured traffic to the /tmp/cap file. The “host” keyword indicates that I want all traffic to and from the IP address of the iSMS (192.168.1.50). I ran this command on the machine hosting both Nagios and Apache, so it should see all communication between these services and the iSMS. I then generated some SMS messages traffic by causing Nagios to send out a “PROBLEM” message, which I then acknowledged via my mobile phone. You should see the number following “Got” incrementing as packets are being captured:
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes Got 22
I then interrupted the capture and converted the captured data to plain text:
% tcpdump -A -r /tmp/cap > /tmp/txt
The -A option writes out ASCII text, and the -r option reads capture data from a file. Examining the /tmp/txt file allows you to see the entire HTTP transaction between Nagios, the iSMS and the CGI script:
12:50:09.434851 IP nagios.46058 > smsgw.http: P 1:266(265) ack 1 win 46 <nop,nop,timestamp 2801435752 1987587011> GET /sendmsg?user=nagios&passwd=secret...text=ACKNOWLEDGEMENT... 12:50:09.501524 IP smsgw.http > nagios.46058: P 1:29(28) ack 266 win 6432 <nop,nop,timestamp 1987587017 2801435752> HTTP/1.0 200 OK ID: 2078
In this capture, you can see that the sendsms.pl script invoked by Nagios (hostname nagios) has sent an HTTP GET to the iSMS (hostname smsgw) containing the Nagios “ACKNOWLEDGEMENT” message. The “ID: 2078” response from the iSMS back to Nagios indicates that the message has been queued for sending and that the ID for this SMS message is 2078. You also might note that the user name and password for the iSMS “nagios” account is being sent in the clear—not perfect, but I think this is a pretty low security risk, as this transaction is internal to the company network.
My original iSMS came with version 1.20 firmware. This worked fine with the original Perl script, but it had a problem in that it was somewhat “single user”. For example, if you happened to be logged in to the iSMS Web user interface while the check_smsfinder.pl script ran, it would return a bad status, and Nagios would create an alert for the device. Upgrading to the newer firmware fixed this problem, but broke the check_smsfinder.pl script. The Perl script has been updated, but the version of the script now is tied to the firmware version running on the iSMS.
Because this is a Perl script, it can be modified easily. If you do not like the format of the message being sent out by Nagios, you can change this in the Nagios “command” definition—for example, “notify-host-by-sms” and also change the Perl script to parse whatever format of a message you want to send back from your phone. The script authors have changed their message format over time to make it easier to parse, as problems were discovered with whitespace in service names and host alerts that would change format depending on whether the host definition contained an IP address (such as in the case of DHCP clients).
I am very pleased with how this alerting service has worked out. The iSMS has been solid since the moment I installed it, and the associated script has worked perfectly once I tweaked my Nagios setup to match it. I have high confidence that I will get alerts regardless of the nature of the problem.
Thanks to Birger Schmidt and his colleagues from NETWAYS GmbH (www.netways.de) for writing the original script, updating it and taking the time to review this article, and to Chris Reilley (www.reilleydesign.com) for Figure 2.
Eric Pearce is the IT Lead for AmberPoint, Inc., an Application Management and Governance software company based in Oakland, California. He has authored several books on UNIX and Windows system administration for O'Reilly & Associates.
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?
|Speed Up Your Web Site with Varnish||Jun 19, 2013|
|Non-Linux FOSS: libnotify, OS X Style||Jun 18, 2013|
|Containers—Not Virtual Machines—Are the Future Cloud||Jun 17, 2013|
|Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer||Jun 12, 2013|
|Weechat, Irssi's Little Brother||Jun 11, 2013|
|One Tail Just Isn't Enough||Jun 07, 2013|
- Speed Up Your Web Site with Varnish
- Containers—Not Virtual Machines—Are the Future Cloud
- Real Programming with AWK
- Examining Load Average
- Pluggable Authentication Modules for Linux
- One-Click Release Management
- Non-Linux FOSS: libnotify, OS X Style
- Python Programming for Beginners
- Boot with GRUB
- DreamWorks Feature Linux and Animation