A System Monitoring Dashboard

This simple set of shell scripts keeps you informed about disks that are filling up, CPU-hog processes and problems with the Web and mail servers.

Finally, we come to our stats script, monitor_stats.pl. This script logs in to each host and runs the commands:


df -k
swapon -s
top -n 1 | head -n 20
hostname
uptime

It then displays the results in a browser (Figure 2) and saves the result in a log, again sorted by date on the filesystem. It serves as a simple dashboard to give quick stats on each server.

The benefit of this monitoring design is threefold:

  1. We have a history of CPU, disk and swap usage, and we easily can pinpoint where problems may have occurred.

  2. Tedious typing to extract this information for each server is reduced. This comes in handy before leaving work to resolve potential problems before getting paged at night.

  3. Management quickly can see how well we're doing.

We are using the insecure rsh protocol in this script to show you how to get this set up quickly, but we recommend that you use SSH with properly distributed keys to gain security.
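As a rough illustration of the collection step, and not the contents of monitor_stats.pl itself (which is a Perl script), the sketch below runs the same commands over ssh, as recommended above, and writes one log per host into a directory named for the current date. The names collect_stats, REMOTE_SHELL and LOG_ROOT, the log path and the host names are assumptions made for illustration, and -b is added to top so that it can run without a terminal.

```shell
#!/bin/sh
# Sketch only: all names and paths here are hypothetical, not taken from
# monitor_stats.pl.

# Command used to reach a host. ssh assumes keys are already distributed.
REMOTE_SHELL=${REMOTE_SHELL:-ssh}
# Root directory for the logs; one subdirectory is created per day.
LOG_ROOT=${LOG_ROOT:-/var/log/monitor_stats}

# The commands listed in the article, joined into one remote invocation.
# top gets -b (batch mode) because no terminal is attached here.
STATS_CMD='hostname; uptime; df -k; swapon -s; top -b -n 1 | head -n 20'

# collect_stats HOST: run the commands on HOST and save the output.
collect_stats() {
    host=$1
    day_dir="$LOG_ROOT/$(date +%Y-%m-%d)"
    mkdir -p "$day_dir" || return 1
    # $REMOTE_SHELL is left unquoted so it may carry options such as "ssh -q".
    $REMOTE_SHELL "$host" "$STATS_CMD" > "$day_dir/$host.log" 2>&1
}

# Example use (host names are placeholders):
#   for host in web01 mail01; do collect_stats "$host"; done
```

Keeping one directory per day means a plain ls of the log root already gives the date-sorted history, without any extra indexing.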

Figure 2. monitor_stats.pl in Action

Conclusion

With the use of this new system monitoring dashboard, my team's productivity has increased and its confidence in monitoring has soared, because we no longer are wasting time chasing down false positives. A history of system performance has been a real time saver in diagnosing problems. Finally, easy installation allows users with basic skills to conquer a complex system administration problem in one business day.

Resources for this article: /article/8269.

John Ouellette is a system administrator with nine years of experience in Microsoft Windows NT and UNIX. He believes the command line is king and loves chicken parmigiana. He can be reached at john_ouellette@yahoo.com.

______________________

Comments

Nagios

Will_'s picture

I believe Nagios already does this.

Also checking the page ...

ecazarez's picture

We also have scripts for monitoring a web server. In addition to knowing that the web service is up and running, we wanted to know that the page had not changed, so we did this:

- Precalculate and store the MD5 of the page we want to check.
- Every few minutes (a crontab line), fetch the page, calculate its MD5 and compare it with the precalculated one. If they differ, there is a problem.

The calculation is a one-liner:


fetch -q -o - http://www.thepage.tocheck | md5
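The comparison step itself might look something like the following sketch. It assumes wget and md5sum on Linux and falls back to the md5 command used above when md5sum is not installed. The names md5_of_stdin, check_page and FETCH_CMD are made up for this example, and the URL and checksum in the usage comment are placeholders.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Command that writes a page to stdout. On FreeBSD, "fetch -q -o -" as in
# the one-liner above would be the equivalent.
FETCH_CMD=${FETCH_CMD:-wget -q -O -}

# Print the MD5 hex digest of stdin with whichever tool is available.
md5_of_stdin() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum | cut -d ' ' -f 1
    else
        md5
    fi
}

# check_page URL EXPECTED_MD5: succeed only if the page still matches.
# A failed download yields a different digest, so it is reported as a change.
check_page() {
    url=$1
    expected=$2
    actual=$($FETCH_CMD "$url" | md5_of_stdin)
    [ "$actual" = "$expected" ]
}

# Example use from a script run by cron:
#   check_page http://www.thepage.tocheck "$STORED_MD5" || echo "page changed"
```

Pages with dynamic content (timestamps, session tokens) change their digest on every request, so this check suits static pages only.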

Also, to avoid the problem of installing extra Perl modules to send email, we use this:


sub send_mail {
    # Send an email message through the local sendmail binary.
    # Parameters:
    #   $from_address - address to use in the From field
    #   $subject      - subject of the message
    #   $body         - message body
    #   $to_address   - destination address(es) for the message
    my ($from_address, $subject, $body, $to_address) = @_;
    open(my $sendmail, '|-', '/usr/lib/sendmail', '-t')
        or do { warn "Cannot open sendmail: $!\n"; return; };
    # A blank line must separate the headers from the body.
    print $sendmail <<MESSAGE;
To: $to_address
From: $from_address
Subject: $subject

$body
MESSAGE
    close $sendmail or warn "sendmail exited abnormally: $?\n";
    return 1;
}

So, not having the module is not a problem for sending emails :)

Sendmail

jcoyle's picture

How would you include your sendmail script in web_monitor.sh?
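One possible answer, offered as a sketch rather than as the article's method, is to skip Perl and pipe the message to sendmail directly from the shell script. The function name send_mail, the SENDMAIL variable and the addresses in the usage comment are assumptions for illustration. The sendmail path is the one from the Perl example above and may differ on other systems.

```shell
#!/bin/sh
# Sketch only: a shell counterpart of the Perl send_mail sub.

# Path to the sendmail binary; adjust for the local system.
SENDMAIL=${SENDMAIL:-/usr/lib/sendmail}

# send_mail FROM SUBJECT BODY TO
send_mail() {
    from=$1 subject=$2 body=$3 to=$4
    # printf writes the headers, a blank separator line, then the body.
    # sendmail -t takes the recipients from the To: header.
    printf 'To: %s\nFrom: %s\nSubject: %s\n\n%s\n' \
        "$to" "$from" "$subject" "$body" | "$SENDMAIL" -t
}

# Example use inside a monitoring script (addresses are placeholders):
#   send_mail monitor@example.com "Web check failed" \
#       "www1 did not respond" admin@example.com
```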

script errors

GZILL's picture

I get a ": bad interpreter" error when running monitor_web.sh after having configured web servers, email addy and correcting the path to wget. Please advise.

line endings

Russ's picture

I had to fix the line endings of the script after downloading it to Windows XP and moving the tgz file to my Debian box. Try

tr -s "\r" "\n" < monitor_web.sh > b && mv b monitor_web.sh;

You'll have to re-enable the executable bit.
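Note that tr -s also squeezes runs of newlines, so the command above removes the script's blank lines along with the carriage returns. A variant that only deletes the carriage returns, and restores the executable bit in the same step, could look like this sketch. The function name fix_line_endings is made up for the example.

```shell
#!/bin/sh
# Sketch only: strip carriage returns without touching blank lines.

# fix_line_endings FILE
fix_line_endings() {
    file=$1
    tmp=$(mktemp) || return 1
    # tr -d deletes every carriage return and leaves the newlines alone.
    tr -d '\r' < "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back into the original file, then clean up the temporary copy.
    cat "$tmp" > "$file" || { rm -f "$tmp"; return 1; }
    rm -f "$tmp"
    # The executable bit is often lost on the trip through Windows.
    chmod +x "$file"
}

# Example use:
#   fix_line_endings monitor_web.sh
```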

bash or ksh ?

Anonymous's picture

Is the first line pointing to bash or ksh?
Make sure it's bash if you are using Linux, as most distros don't have ksh by default.

permissions ok?

Anonymous's picture

have you set perms to 755 on the script?
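The three suggestions in this thread (line endings, interpreter path, permissions) can be checked in one pass. The following is a minimal sketch. The function name check_script and its messages are invented for this example and are not part of the article's scripts.

```shell
#!/bin/sh
# Sketch only: report the usual causes of a ": bad interpreter" error.

# check_script FILE
check_script() {
    file=$1
    first_line=$(head -n 1 "$file")
    cr=$(printf '\r')
    # A trailing carriage return means the file has DOS line endings.
    case $first_line in
        *"$cr") echo "CRLF line endings"; return 1 ;;
    esac
    case $first_line in
        '#!'*) ;;
        *) echo "no #! line"; return 1 ;;
    esac
    # The interpreter path is the first word after "#!".
    interp=${first_line#'#!'}
    set -- $interp
    interp=$1
    [ -x "$interp" ] || { echo "interpreter not found: $interp"; return 1; }
    [ -x "$file" ] || { echo "not executable, try chmod 755"; return 1; }
    echo "ok"
}

# Example use:
#   check_script monitor_web.sh
```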
