SMART (Smart Monitoring and Rebooting Tool)
SMART is easy to install (you simply copy the program) and much simpler to configure than Nagios: adding a new element to monitor takes only one line in the configuration file. It is also flexible enough to monitor any service or aspect of the system, and in our experience it has been very effective.
Our experience in a production environment with thousands of users tells us that peak periods are inevitable: at some point the number of requests a service receives exceeds the capabilities of the system, and response time grows dramatically. Having the system detect this situation before its own administrator does, and resolve it within five minutes, saves a great deal of trouble and gives users the perception of better service.
After two years of running SMART on about 15 servers, we can say that its main contribution has been our peace of mind. It's wonderful to have a colleague who checks that everything works correctly 24/7 and who tells you about problems only after they have been solved (especially on weekends).
SMART was created, developed, tested and enjoyed in the IT Department of the Universitat Internacional de Catalunya. Vicente Sangrador and Jordi Xavier Prat have collaborated on this project and encouraged me to write this article.
Resources for this article: /article/9268.
Albert Martorell is a Telecommunications Engineer and has been working as a network and “penguins” administrator in the IT Department of the Universitat Internacional de Catalunya since 1998.