Using an SMS Server to Provide a Robust Alerting Service for Nagios

How to implement a Nagios-to-SMS service.
Debugging Installation and Runtime Problems

I was able to get everything running in a day or two, but I had to resolve several issues during the installation. I also discovered several problems that required changes to the Perl script. It therefore is important to test the scripts before relying on them.

You can run the check_smsfinder.pl and sendsms.pl scripts on the command line to view their output directly. For example:

% /usr/local/nagios/libexec/check_smsfinder.pl \
     -H 192.168.1.50 -u nagios -p secret
OK: GSM signal strength is 100.0% - \
    model: SF100-G - \
    firmware: 1.31|loginID=1607132337 strength=100.0%;40;20;;

% /usr/local/nagios/smsack/sendsms.pl \
     --noma -H 192.168.1.50 -u nagios -p secret \
     -n 14155551212 -m 'this is a SMS from nagios'
"this%20is%20a%20SMS%20from%20nagios" to 14155551212 \
     via 192.168.1.50 send successfully. MessageID: 37
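
If you want to check the plugin output from a script rather than by eye, the text before the "|" and the performance data after it can be separated. The following is a minimal Python sketch, assuming the output arrives on one line and follows the usual Nagios plugin layout of text|label=value[unit];warn;crit;min;max. The function name parse_plugin_output is hypothetical and is not part of the smsfinder scripts.

```python
import re


def parse_plugin_output(line):
    """Split plugin output into (status_text, metrics).

    Hypothetical helper, not part of the smsfinder scripts. Assumes the
    conventional layout: text|label=value[unit];warn;crit;min;max ...
    """
    text, _, perf = line.strip().partition("|")
    metrics = {}
    for item in perf.split():
        label, _, rest = item.partition("=")
        fields = rest.split(";")
        match = re.match(r"^(-?\d+(?:\.\d+)?)(\D*)$", fields[0])
        if not match:
            continue  # ignore items whose value is not numeric
        # Pad so warn and crit are always present, even when omitted.
        fields += [""] * (3 - len(fields))
        metrics[label] = {
            "value": float(match.group(1)),
            "unit": match.group(2),
            "warn": fields[1],
            "crit": fields[2],
        }
    return text, metrics


# The check_smsfinder.pl output shown above, joined onto one line.
sample = ("OK: GSM signal strength is 100.0% - model: SF100-G - "
          "firmware: 1.31|loginID=1607132337 strength=100.0%;40;20;;")
status, perf = parse_plugin_output(sample)
print(status)
print(perf["strength"])
```

The thresholds are kept as strings because the plugin convention allows ranges as well as plain numbers.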

The smsack.cgi script is a little harder to debug than the command-line scripts. The usual Apache log files are useful here: access_log records the HTTP response code each time the iSMS invokes the CGI, and error_log captures any error output from the script. You also can use the method described below under “Network Capture” to look for problems with the CGI script.
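
To summarize the response codes for the CGI, you could tally them from access_log. This is a rough Python sketch, assuming Apache's common or combined log format. The sample lines and the /cgi-bin/smsack.cgi path are made up for illustration, and cgi_status_counts is a hypothetical name.

```python
import re
from collections import Counter

# Matches the quoted request and the status code that follows it,
# as found in Apache's common and combined log formats.
REQUEST_RE = re.compile(r'"[A-Z]+ (\S+)[^"]*" (\d{3})\b')


def cgi_status_counts(lines, script="smsack.cgi"):
    """Count HTTP status codes for requests whose path mentions script."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and script in match.group(1):
            counts[match.group(2)] += 1
    return counts


# Made-up sample lines; the CGI path is an assumption.
sample = [
    '192.168.1.50 - - [12/Nov/2009:09:15:41 -0800] '
    '"POST /cgi-bin/smsack.cgi HTTP/1.1" 200 120',
    '192.168.1.50 - - [12/Nov/2009:09:20:02 -0800] '
    '"POST /cgi-bin/smsack.cgi HTTP/1.1" 500 610',
    '192.168.1.7 - - [12/Nov/2009:09:21:10 -0800] '
    '"GET /nagios/ HTTP/1.1" 200 900',
]
print(cgi_status_counts(sample))
```

In practice, you would pass an open file handle for your own access_log instead of the sample list. Any 4xx or 5xx counts point to requests worth looking up in error_log.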

Logging

Nagios, the Perl script and the iSMS device all record debugging information in several places. Knowing where those places are will help you with your installation.

The iSMS can send helpful debugging messages to a remote host via syslog. The Nagios server is an ideal destination for these messages, because all logging then is consolidated in one place. The remote syslog host is specified in the iSMS Web GUI. The iSMS syslog messages use the LOG_LOCAL0 facility. I added a local0.* /var/log/isms entry to my /etc/syslog.conf file to capture all messages. The log file records all SMS messages sent and received by the iSMS, for example:

Nov 23 09:27:59 smsgw MultiModemiSMS modem: sentlog:
    [SENT TO] : 14155551212 : [MSG] : this is a SMS from Nagios
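
To list which numbers were sent which messages, the sentlog entries can be pulled out of the log with a regular expression. This is a minimal Python sketch, assuming each entry sits on a single line in the file (it is wrapped above only for display) and that the field layout matches the example. The name sent_messages is hypothetical.

```python
import re

# Layout taken from the example entry: sentlog: [SENT TO] : number : [MSG] : text
SENT_RE = re.compile(
    r"sentlog:\s*\[SENT TO\]\s*:\s*(\+?\d+)\s*:\s*\[MSG\]\s*:\s*(.*)$"
)


def sent_messages(lines):
    """Yield (number, message) pairs for each sentlog entry in lines."""
    for line in lines:
        match = SENT_RE.search(line)
        if match:
            yield match.group(1), match.group(2).strip()


# The example entry from above, joined onto one line.
sample = [
    "Nov 23 09:27:59 smsgw MultiModemiSMS modem: sentlog: "
    "[SENT TO] : 14155551212 : [MSG] : this is a SMS from Nagios",
]
for number, message in sent_messages(sample):
    print(number, message)
```

To run it against the real log, pass an open handle for /var/log/isms (or whichever file your syslog.conf entry names) in place of the sample list.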

The log also contains any authentication failures. This is useful because the check_smsfinder.pl and sendsms.pl scripts authenticate themselves to the iSMS every time they run.

The iSMS has a concept of an “Inbox” for SMS messages received from mobile users and an “Outbox” for SMS messages being sent out from the iSMS. You can examine these boxes via the iSMS Web interface to find out whether a message actually was received or transmitted.

Nagios logs to the file nagios.log, which is typically found in the /usr/local/nagios/var directory. You can use this log to verify that Nagios is generating an alert for a problem and that a command has been used to send an SMS (notify-host-by-sms):

[1258664139] HOST NOTIFICATION:
    epearce-sms;mailserv2;DOWN;notify-host-by-sms;CRITICAL -
    Host Unreachable (192.168.1.250)
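
The timestamp in square brackets is in seconds since the UNIX epoch, which makes the log hard to scan by eye. This is a small Python sketch that converts the timestamp and splits the semicolon-separated fields, assuming each entry is on one line in nagios.log and has the five fields shown in the example. The function name and the dictionary keys are my own labels for illustration.

```python
import re
from datetime import datetime, timezone

NOTIFY_RE = re.compile(r"^\[(\d+)\] HOST NOTIFICATION: (.*)$")


def parse_host_notification(line):
    """Return a dict for a HOST NOTIFICATION entry, or None if no match."""
    match = NOTIFY_RE.match(line.strip())
    if not match:
        return None
    # Limit the split so semicolons inside the plugin output are kept.
    parts = match.group(2).split(";", 4)
    if len(parts) != 5:
        return None
    contact, host, state, command, output = parts
    return {
        "time": datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc),
        "contact": contact,
        "host": host,
        "state": state,
        "command": command,
        "output": output,
    }


# The example entry from above, joined onto one line.
sample = ("[1258664139] HOST NOTIFICATION: "
          "epearce-sms;mailserv2;DOWN;notify-host-by-sms;"
          "CRITICAL - Host Unreachable (192.168.1.250)")
entry = parse_host_notification(sample)
print(entry["time"].isoformat(), entry["host"], entry["command"])
```

Filtering on the command field (notify-host-by-sms here) is a quick way to confirm that Nagios tried to send an SMS for a given problem.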

The Nagios log also will show the results of smsack.cgi running after getting the “ACK” back from a mobile user:

[1258500602] EXTERNAL COMMAND:
    ACKNOWLEDGE_HOST_PROBLEM;mailserv2;1;1;1;14155551212;
    Acknowledged by 14155551212 at 09/11/17 15:29:57
    ACK PROBLEM mailserv2> is DOWN /11-17-2009 15:28:21/ CRITICAL -
    Host Unreachable(192.168.1.250)

The smsfinder scripts log to smsfinder.log (also in the Nagios var directory). This file contains debugging information from the script when it runs as sendsms.pl or as smsack.cgi. The lines containing “SMSsend” show the status of sendsms.pl when it is run by Nagios. For example:

2009/11/19 12:55:39 SMSsend:
    "PROBLEM...mailserv...is...DOWN...CRITICAL..."
    to 14155551212 via 192.168.1.250 queued successfully.
    MessageID: 14

Lines containing “SMSreceived” and “SMSverify” will show the progress in parsing any acknowledgement SMS messages received by the smsack.cgi script:

2009/11/12 09:15:41 SMSreceived:
    username=nagios&password=secret&XMLDATA=
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <Message Notification>
      <SenderNumber>14155551212</SenderNumber>
      <Message>
        ACK PROBLEM HostAlert mailserv2 192.168.1.250
        /AllServices is DOWN
        /11-12-2009 09:11:46/ CRITICAL -
        Host Unreachable (192.168.1.250)
      </Message>
      <Date>09/11/12</Date>
      <Time>09:15:36</Time>
    </Message Notification>

2009/11/12 09:15:41 SMSverify
    status = ACKed - ACCEPTED:
    From=14155551212 Received=09/11/12 09:15:36
    Status=ACK Host=mailserv2 Service=AllServices
    MSG="ACK PROBLEM ... Host Unreachable (192.168.1.250)"
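
When an acknowledgement is rejected, it helps to see exactly which sender, message and time the iSMS delivered. This is a rough Python sketch that pulls those fields out of an SMSreceived payload. It uses regular expressions rather than an XML parser because the root element is printed above with a space in its name, and it assumes the Date field is in YY/MM/DD order, as the example suggests. The name extract_fields and the "Received" key are hypothetical.

```python
import re
from datetime import datetime


def extract_fields(payload):
    """Extract the sender, message, date and time from a logged payload."""
    fields = {}
    for tag in ("SenderNumber", "Message", "Date", "Time"):
        match = re.search(r"<%s>(.*?)</%s>" % (tag, tag), payload, re.DOTALL)
        if match:
            # Collapse line breaks and indentation into single spaces.
            fields[tag] = " ".join(match.group(1).split())
    if "Date" in fields and "Time" in fields:
        # Assumption: Date is YY/MM/DD, as in the example above.
        fields["Received"] = datetime.strptime(
            fields["Date"] + " " + fields["Time"], "%y/%m/%d %H:%M:%S"
        )
    return fields


# The payload from the example above.
sample = """username=nagios&password=secret&XMLDATA=
<?xml version="1.0" encoding="ISO-8859-1"?>
<Message Notification>
  <SenderNumber>14155551212</SenderNumber>
  <Message>
    ACK PROBLEM HostAlert mailserv2 192.168.1.250
    /AllServices is DOWN
    /11-12-2009 09:11:46/ CRITICAL -
    Host Unreachable (192.168.1.250)
  </Message>
  <Date>09/11/12</Date>
  <Time>09:15:36</Time>
</Message Notification>"""
info = extract_fields(sample)
print(info["SenderNumber"], info["Received"])
print(info["Message"])
```

Comparing these values with the From, Received and MSG fields on the matching SMSverify line shows whether the script parsed the message the way you expect.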
