A System Monitoring Dashboard
For about a year, my company had been struggling to roll out a monitoring solution. False positives and inaccurate after-hours pages were affecting morale and wasting system administrators' time. After speaking to some colleagues about what we really needed to monitor, it came down to a few things:
Web servers—by way of HTTP, not only physical servers.
Disk space.
SMTP servers' availability—by way of SMTP, not only physical servers.
A history of these events to diagnose and pinpoint problems.
This article explains the process I developed and how I set up disk, Web and SMTP monitoring both quickly and simply. Keeping the monitoring process simple meant that all the tools used should be available on a recent Linux distribution and should not use advanced protocols, such as SNMP or database technology. As a result, all of my scripts use the Bash shell, basic HTML, some modest Perl and the wget utility. All of these monitoring scripts share the same general skeleton and installation steps, and they are available from the Linux Journal FTP site (see the on-line Resources).
Installing the scripts involves several steps. Start by copying the script to a Web server and making it world-executable with chmod. Then, create a directory under the root of your Web server where the script can write its logs and history. I used webmon for monitor_web.sh. The other scripts are similar: I used smtpmon for monitor_smtp.sh and stats for monitor_stats.pl. monitor_disk.sh is different from the others because it is the only one installed locally on each server you want to monitor.
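The installation steps above can be sketched as a small helper. This is only an illustration: the function name install_monitor, its argument order and the document root shown in the usage note are assumptions, not part of the published scripts.

```shell
#!/bin/bash
# Illustrative helper; install_monitor is a hypothetical name, not part of
# the scripts distributed with the article.

# $1 = path to the monitoring script
# $2 = document root of the Web server (site-specific, an assumption)
# $3 = subdirectory for logs and history, for example webmon
install_monitor() {
    local script=$1 docroot=$2 subdir=$3
    chmod 755 "$script" || return 1            # make it world-executable
    mkdir -p "$docroot/$subdir" || return 1    # where logs and history go
    echo "$docroot/$subdir"                    # report the directory created
}
```

A call might look like `install_monitor ./monitor_web.sh /var/www/html webmon`, where /var/www/html stands in for whatever document root your Web server uses. The directory still has to be writable by the user that runs the script.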
Next, schedule the scripts in cron. You can run each script with any user capable of running wget, df -k and top. The user also needs to have the ability to write to the script's home. I suggest creating a local user called monitor and scheduling these through that user's crontab. Finally, install wget if it is not already present on your Linux distribution.
My first challenge was to monitor the Web servers by way of HTTP, so I chose wget as the engine and scripted around it. The resulting script is monitor_web.sh. For those unfamiliar with wget, its author describes it as “a free software package for retrieving files using HTTP, HTTPS and FTP, the most widely used Internet protocols” (see Resources).
After installation, monitor_web.sh requires the user to set only two values, the e-mail recipient and the URLs to monitor, both of which are labeled clearly in the script. The URLs must conform to HTTP standards and return a valid HTTP 200 OK response to work. They can be HTTP or HTTPS, as wget and monitor_web.sh support both. Once the script is installed and has run for the first time, the user can go to http://localhost/webmon/webmon.html and view the URLs, the last result and the history in a Web browser, as they all are links.
Now, let's break down the script; see monitor_web.sh, available on the LJ FTP site. First, I set all the variables for system utilities and the wget program. These may change on your system. Next, we make sure we are on the network. This ensures that if the server monitoring the URLs goes off-line, a massive number of alerts are not queued up by Sendmail until the server is back on-line.
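The article does not show how the network check is done. One plausible approach, offered purely as a sketch, is to ping a host that should always be reachable, such as the default gateway. The names GATEWAY and on_network and the placeholder address are assumptions.

```shell
#!/bin/bash
# Hypothetical sketch; GATEWAY and on_network are illustrative names and
# may not match what monitor_web.sh actually does.

GATEWAY=${GATEWAY:-192.0.2.1}   # placeholder address; use your own router

# Succeeds only if the gateway answers one ping within two seconds.
on_network() {
    ping -c 1 -W 2 "$GATEWAY" > /dev/null 2>&1
}
```

A monitoring script could then start with `on_network || exit 0`, so that a monitoring host that has lost its own connection exits quietly instead of queuing an alert for every URL.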
As I loop through all the URLs, I have wget connect two times with a timeout of five seconds. I do this twice to reduce false positives. If the Web site is down, the script generates an e-mail message for the recipient and updates the Web page. Mail also is sent when the site is back up. The script sends only one message, so we don't overwhelm the recipient. This is achieved with the following code:
wget $URL 2>> $WLOG
if (( $? != 0 )); then
    # Record the failure on the status page
    echo \<A HREF\="$URL"\>$URL\<\/A\> is down \
        $RTAG $EF. \
        $LINK Last Result $LTAG >> $WPAGE
    # Send one alert only; the flag file marks it as sent
    if [[ ! -e /tmp/down.$ENV ]]; then
        touch /tmp/down.$ENV
        mail_alert down
    else
        echo Alert already sent for $ENV \
            - waiting | tee -a $WLOG
    fi
fi
I have included the HTML for green and red text in the script, if you choose not to use graphics. Again, the full script is available from the Linux Journal FTP site.
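The alert-once behavior described above, including the message sent when a site comes back up, can be sketched as a small state machine built around the flag file. This is a simplified illustration, not the code from monitor_web.sh: the names STATE_DIR, probe_url and next_action are assumptions, and the wget options are one way to get two attempts with a five-second timeout.

```shell
#!/bin/bash
# Hypothetical sketch; function and variable names are illustrative.

STATE_DIR=${STATE_DIR:-/tmp}   # where the "down" flag files live (assumption)

# Probe a URL: two attempts, five-second timeout, nothing saved to disk.
probe_url() {
    wget --quiet --spider --tries=2 --timeout=5 "$1"
}

# Decide what to do for one site.
# $1 = short name for the site, $2 = exit status of the probe
# Prints one of: alert_down, still_down, alert_up, ok
next_action() {
    local flag="$STATE_DIR/down.$1"
    if (( $2 != 0 )); then
        if [[ -e $flag ]]; then
            echo still_down        # alert already sent, stay quiet
        else
            touch "$flag"
            echo alert_down        # first failure: send one message
        fi
    else
        if [[ -e $flag ]]; then
            rm -f "$flag"
            echo alert_up          # site recovered: send the all-clear
        else
            echo ok
        fi
    fi
}
```

In a monitoring loop, this could be used as `probe_url "$URL"; rc=$?; action=$(next_action "$ENV" "$rc")`, with mail sent only when the result is alert_down or alert_up.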

Figure 1. monitor_web.sh in action. Run the script from cron to regenerate this page as often as needed.
With the Web servers taken care of, it was time to tackle disk monitoring. True to our keep-it-simple philosophy, I chose to create a script that would run from cron and alert my team based on the output of df -k. The result was monitor_disk.sh. The first real block of code in the script sets up the filesystems list:
FILESYSTEMS=$(mount | grep -iv proc | \
    grep -iv cdrom | awk '{print $3}')
I ignore proc and am careful not to report on the CD-ROM, should my teammates put a disk in the drive. The script then compares the value of Use% to two values, THRESHOLD_MAX and THRESHOLD_WARN. If Use% exceeds either one, the script generates an e-mail to the appropriate recipient, RECIPIENT_MAX or RECIPIENT_WARN. Notice that I made sure the Use% value for each filesystem is interpreted as an integer with this line:
typeset -i UTILIZED=$(df -k $FS | tail -1 | \
awk '{print $5}' | cut -d"%" -f1)
A mailing list was set up with my team members' e-mail addresses and the e-mail address of the on-call person to receive the critical e-mails and pages. You may need to do the same with your mail server software, or you simply can use your group or pager as both addresses.
Because our filesystems tend to be large, about 72GB–140GB, I have set critical alerts to 95%, so we still have some time to address issues when alerted. You can set your own threshold with the THRESHOLD_MAX and THRESHOLD_WARN variables. Also, our database servers run some disk-intensive jobs and can generate large amounts of archive log files, so I figured every 15 minutes is a good frequency at which to monitor. For servers with less active filesystems, once an hour is enough.
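The threshold comparison described above might look something like the following. It is a sketch under stated assumptions: the function name classify_usage and the 85% warning level are illustrative, and only the 95% critical level comes from the text.

```shell
#!/bin/bash
# Illustrative sketch; classify_usage and the warning default are assumptions.

THRESHOLD_WARN=${THRESHOLD_WARN:-85}   # assumed warning level, in percent
THRESHOLD_MAX=${THRESHOLD_MAX:-95}     # critical level used in the article

# Classify a df Use% field.
# $1 = value such as "97%"; prints MAX, WARN or OK.
classify_usage() {
    typeset -i used=${1%\%}            # drop the trailing %, force an integer
    if (( used > THRESHOLD_MAX )); then
        echo MAX                       # mail RECIPIENT_MAX
    elif (( used > THRESHOLD_WARN )); then
        echo WARN                      # mail RECIPIENT_WARN
    else
        echo OK
    fi
}
```

For a real filesystem, the argument could come from `df -kP "$FS" | tail -1 | awk '{print $5}'`, and the result then selects which recipient, if any, receives mail.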
Our third script, monitor_smtp.sh, monitors our SMTP servers' ability to send mail. It is similar to the first two scripts; the main task was finding a way to connect directly to a user-defined SMTP server so I could loop through a server list and send a piece of mail. This is where smtp.pl comes in. It is a Perl script (Listing 1) that uses the Net::SMTP module to send mail to an SMTP address. Most recent distributions have this module installed already (see the Do I Have That Perl Module Installed sidebar). monitor_smtp.sh updates the defined Web page based on the success of the transmission carried out by smtp.pl. No attempt is made to alert our group, as this is a troubleshooting tool and, ironically, cannot rely on SMTP to send mail if a server is down. Future versions of monitor_smtp.sh may include a round-robin feature and be able to send an alert through a known working SMTP server.
Comments
Nagios
I believe Nagios already does this.
Also checking the page ...
We also have scripts for monitoring a Web server. In addition to knowing that the Web service is up and running, we wanted to know that the page hasn't changed, so we did this:
- Precalculated and stored the MD5 of the page we want to check.
- Every few minutes (a crontab line), fetch the page, calculate its MD5 and compare it with the precalculated one. If they differ, there is a problem.
The calculation is a one-liner:
fetch -q -o - http://www.thepage.tocheck | md5
Also, to avoid having to install a Perl module to send e-mail, we use this:
sub send_mail {
    # Send an e-mail message through the local sendmail binary.
    # Parameters:
    #   $from_address  address to use in the From field
    #   $subject       subject of the message
    #   $body          message body
    #   $to_address    destination address(es) for the message
    my ($from_address, $subject, $body, $to_address) = @_;
    open(SENDMAIL, "|/usr/lib/sendmail -t")
        or do { warn "Cannot open sendmail: $!\n"; return; };
    print SENDMAIL <<MESSAGE;
To: $to_address
From: $from_address
Subject: $subject

$body
.
MESSAGE
    close SENDMAIL;
}
So, not having the module is not a problem for sending emails :)
Sendmail
How would you include your sendmail script in web_monitor.sh?
script errors
I get a ": bad interpreter" error when running monitor_web.sh after having configured web servers, email addy and correcting the path to wget. Please advise.
line endings
I had to fix the line endings of the script after downloading it to Windows XP and moving the tgz file to my Debian box. Try
tr -s "\r" "\n" < monitor_web.sh > b && mv b monitor_web.sh;
You'll have to re-enable the executable bit.
bash or ksh ?
Is the first line pointing to bash or ksh?
Make sure it's bash if you're using Linux, as most distros don't have ksh by default.
permissions ok?
have you set perms to 755 on the script?