Using an SMS Server to Provide a Robust Alerting Service for Nagios

How to implement a Nagios-to-SMS service.
Ordering the iSMS

The product is generally available, and I simply used a price comparison site to find the cheapest one, as I didn't feel I needed support from the vendor. MultiTech made several changes to its product while I was in the midst of writing this article. These changes included renaming the product, updating the firmware version and lowering the price. The iSMS previously was named SMSFinder, and you will see this reflected in the name of the Perl script and in other places. The firmware update required some changes in the Perl code. The original product was priced around $700, but it's now available in the $400 range. This article describes the more recent version of the iSMS and Perl script.

Ordering SMS Service

I did not shop around between the different carriers for SMS service, as my company already had a corporate account with AT&T. I initially tried to walk into an AT&T retail store to buy service for the iSMS, but I was unable to purchase a service package that did not include voice. I ended up doing all the ordering and setup over the phone with AT&T corporate. I was able to get the SIM card at the retail store, which saved me from having to wait for the card to be mailed to my location. AT&T calls its text-only SMS service telemetry. It may make the ordering process easier if you use this terminology with your carrier.

Once you reach the correct ordering department, all you should need to do is read them two numbers: the first identifies the iSMS, and the second identifies the SIM card. The number for the iSMS is the International Mobile Equipment Identity (IMEI) number printed on the iSMS chassis label. The second number, the Integrated Circuit Card ID (ICC-ID) is printed on the SIM card. Once I had communicated these numbers to the carrier, I was able to establish the service and send a test message within a few minutes. Make sure you make note of the subscriber number given to you, as this will be the source of SMS alerts and the destination for your “ACK” and “OK” responses. It is handy to associate a contact name with this number for caller-ID purposes on your mobile phone (for example, “Nagios”). With the service I purchased, the one-time setup fee was $18, and the monthly charge is about $9, depending on usage.

Physical Location of iSMS

I wanted the iSMS and the Nagios server to be able to send messages for as long as possible if there were a problem with network connectivity or power. In my situation, this meant locating the iSMS, the Nagios server and their shared Ethernet switch all in the same computer room and plugged in to the same redundant UPS. Of course, I could eliminate the switch from this configuration by using an Ethernet crossover cable to link the iSMS and the Nagios server directly, but that would limit communication to the one server. It also would eliminate most of the advantages over the hard-wired GSM modems that I was trying to improve upon.

The smsfinder Perl Script

The Perl script can be found on the MonitoringExchange site (www.monitoringexchange.org) or the Nagios Wiki (www.nagioswiki.org). Searching for “smsfinder” should get you to the correct place. The documentation for the script includes installation instructions, example Nagios configuration files and screenshots of the iSMS Web user interface. The author used an interesting approach to create a single script that serves three different purposes. The script checks to see which filename was used to call the script and then performs a completely different function depending on which name was used. The script has three names:

  • smssend.pl: a Nagios “command” used to send messages about hosts and services via SMS.

  • smsack.cgi: a CGI script used by the iSMS to acknowledge alerts it has received via an SMS message sent by a mobile phone.

  • check_smsfinder.pl: a typical Nagios “plugin” or “check” script invoked by Nagios on a scheduled basis to monitor the health of the iSMS device itself.

Script Installation

I stored the actual script as /usr/local/nagios/smsack/smsack.cgi and made two symbolic links to it with the following names/paths:

  • /usr/local/nagios/smsack/sendsms.pl

  • /usr/local/nagios/libexec/check_smsfinder.pl

Apache will want to execute an actual file as a CGI, while Nagios will not care about the symbolic links. Figure 2 provides an overview of how the script interacts with the other components of the system.

Figure 2. SMSFinder Method of Operation

The least complex use of the script is when it is called check_smsfinder.pl. Nagios runs this check script at scheduled intervals, and the script queries a status page on the iSMS via HTTP to make sure it is running okay. The script returns the exit status back to Nagios. The performance data includes the signal level for the GSM modem, the model number and the firmware version.

The next use of the script is when it is called as sendsms.pl by Nagios. In this form, it is used to send host and service alerts and acknowledgements to the iSMS for delivery to mobile users. The script uses the “HTTP Send API” to request that it transmit the SMS message. The iSMS queues the request and returns a message ID. The end user can query the iSMS with the message ID to find out if the SMS message was sent successfully or failed for some reason. When invoked as sendsms.pl, the script has a --noma option, which will query the iSMS after queuing the message to get its status. This takes slightly longer to execute, as the script has to wait for confirmation that the message actually was sent (or failed). The documentation refers to “NoMa” but does not explain why the option was named that way.

The most complicated use of the script is when it runs as the smsack.cgi CGI script under Apache. The recipient of an SMS alert can send the entire message back to the iSMS with the string “ACK” or “OK” prepended to the message text. When this SMS message is received by the iSMS, it uses the “HTTP Receive API” to call the smsack.cgi CGI script with the ACK message. The smsack.cgi CGI script parses the message text, determines whether it is a host or service being acknowledged, verifies that the sender's phone number is in the Nagios object cache and then uses the Nagios “external command” interface to signal Nagios. Nagios then tries to match the host or service name with one it knows about, and if this match is successful, it acknowledges the problem. The script also creates a note on the host or service page indicating that the problem was acknowledged by the sender's mobile number.

The acknowledge feature expects the entire SMS alert message to be sent back to the iSMS with the “ACK” text prepended. I found the best way to do this on both older text-based and newer graphical mobile phones was to “forward” the entire message received from Nagios back to the Nagios phone number and then insert the ACK text at the beginning of the message.

Figure 3. Apple iPhone Screenshot

Figure 3 shows a screenshot from an Apple iPhone. The initial “PROBLEM” alert received from the iSMS is at the top (shown in gray). The message forwarded back to the iSMS with the prepended “ACK” is in the middle (shown in green), and the receipt of the acknowledgment sent from the iSMS is at the bottom (shown in gray). This entire transaction can be accomplished in less than a minute.

______________________

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix