Puppet and Nagios: a Roadmap to Advanced Configuration
Puppet has provided baked-in Nagios support for a long time now. When combined with Exported Resources, Puppet is well suited to manage an intelligent Nagios configuration where nodes are automatically inventoried and monitored. The excellent Pro Puppet, written by James Turnbull, provides a fairly complete rundown of the installation and configuration steps needed in order to progress in this direction, so I won't repeat the information here. Instead, this article highlights some less-than-optimal default behavior of the Nagios types and details my solution that results in a cleaner filesystem and improved performance.
Not All Resources Should Be Exported!
This took me an embarrassingly long time to figure out. Just like resources that are defined in a manifest, Exported Resources must be unique. For example, suppose we have nodes foo and bar, which we'd like to categorize into a Nagios hostgroup named "PMF". At first glance, adding the following code to foo's manifest might seem like the way to go:
@@nagios_hostgroup { "PMF":
ensure => present,
hostgroup_members +> [ $::hostname ]
}
In theory, the resource will be exported to the database when the first node compiles its manifest, but the next node's compilation will complain with a duplicate resource error. For this reason, we will avoid exporting resources created by this particular type. Instead, we will manage our hostgroup memberships via the hostgroup parameter of the nagios_host type.
Had it not been for Pieter Barrezeele's blog (http://pieter.barrezeele.be/2009/05/11/puppet-and-nagios), I may have ended up settling for Puppet's fairly inefficient approach to storing resources managed via its Nagios types. By default, these bits are maintained in hard-coded file paths according to type used. For example, all resources based on the nagios_service type are collected and stored in /etc/nagios/nagios_service.cfg and so on. For performance reasons, I want each collected resource to be stored in its own file path based on the following naming convention:
<base_directory>/<type>_<h3>_<hostname>.cfg
Furthermore, I want my filenames to be composed of all lowercase letters and spaces replaced with underscores. For starters, let's add the bare minimum snippets of code into our manifests in order to export and collect resources using the nagios_host type (Listings 1 and 2).
Listing 1. modules/nagios/manifests/init.pp
# This class will be used by the nagios server
class nagios {
service { nagios:
ensure => running,
enable => true,
}
# Be sure to include this directory in your nagios.cfg
# with the cfg_dir directive
file { resource-d:
path => '/etc/nagios/resource.d',
ensure => directory,
owner => 'nagios',
}
# Collect the nagios_host resources
Nagios_host <<||>> {
require => File[resource-d],
notify => Service[nagios],
}
}
Listing 2. /modules/nagios/manifests/export.pp
# All agents (including the nagios server) will use this
class nagios::export {
@@nagios_host { $::hostname:
address => $::ipaddress,
check_command => 'check_host_alive!3000.0,80%!5000.0,100%!10',
target => "/etc/nagios/resource.d/host_${::hostname}.cfg",
}
}
Note:
Due to the inherent space limitations of published articles, all code will be kept as minimal as possible while conforming to the structure of Puppet Modules. However, no attempt will be made to reproduce a complete module capable of managing a Nagios instance. Instead, I focus on the concepts that have been defined in this article's introduction. Please see http://docs.puppetlabs.com if you need an introduction to Puppet modules.
Let's examine the good and the not-so-good aspects of what we've defined up to this point. On the positive side, all agents will export a nagios_host resource. The Nagios server, upon compiling its manifest, will collect each resource, store it in a unique file, and refresh the Nagios service. At first glance, it may seem like our work is done. Unfortunately, our solution is littered with the following issues and shortcomings:
-
Nagios will not be able to read the newly created .cfg files since the Puppet Agent will create them while running as the root user.
-
There is too much "coordination" needed with the target parameter of the nagios_host type. We should not have to work so hard in order to ensure our target points to the correct file and is void of unpleasant things like spaces and/or mixed case.
-
The address parameter is hard-coded with the value of the ipaddress fact. Although this may be acceptable in some environments, we really should allow for greater flexibility.
-
No ability exists to leverage Nagios hostgroups.
-
Puppet will be unable to purge our exported resources, because we are not using the default behavior of the target parameter.
Today’s modular x86 servers are compute-centric, designed as a least common denominator to support a wide range of IT workloads. Those generic, virtualized IT workloads have much different resource optimization requirements than hyperscale and cloud applications. They have resulted in a “one size fits all” enterprise IT architecture that is not optimized for a specific set of IT workloads, and especially not emerging hyperscale workloads, such as web applications, big data, and object storage. In this report, you will learn how shifting the focus from traditional compute-centric IT architectures to an innovative disaggregated fabric-based architecture can optimize and scale your data center.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
Free Webinar: Linux Backup and Recovery
Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.
In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
| Non-Linux FOSS: Seashore | May 10, 2013 |
| Trying to Tame the Tablet | May 08, 2013 |
| Dart: a New Web Programming Experience | May 07, 2013 |
- RSS Feeds
- New Products
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Home, My Backup Data Center
- Developer Poll
- Dart: a New Web Programming Experience
- May 2013 Issue of Linux Journal: Raspberry Pi
- What's the tweeting protocol?
- Reply to comment | Linux Journal
3 hours 1 min ago - Reply to comment | Linux Journal
3 hours 48 min ago - Web Hosting IQ
5 hours 22 min ago - Thanks for taking the time to
6 hours 59 min ago - Linux is good
8 hours 57 min ago - Reply to comment | Linux Journal
9 hours 14 min ago - Web Hosting IQ
9 hours 44 min ago - Web Hosting IQ
9 hours 44 min ago - Web Hosting IQ
9 hours 45 min ago - Reply to comment | Linux Journal
12 hours 46 min ago



Comments
Re:
purge logic is cumbersome... set recurse, purge, force on resource dir, and expire your nodes in puppet properly (puppet node clean/deactivate foo) and you'll achieve the same ACHO
file ownership is not a problem
"Before we begin, let's make sure we understand the most important problem—the issue of file ownership and permissions for the newly generated .cfg files. Because these files are created via the target parameter of each associated Nagios type, they'll be written to disk by the user Puppet runs as. This means they will be owned by the root user/group, and Nagios will not have permission to read them (because I know you are not running Nagios as root, correct?)."
I don't get this.
On a default Ubuntu 12.04 machine, Nagios3 runs as user nagios. All .cfg files are are owned by root. Nagios is just fine with that. Doesn't even complain in the logs about it.
Even if it weren't, couldn't I just chown nagios:nagios /etc/nagios3/conf.d and chmod g+s /etc/nagios3/conf.d? This would ensure all newly created files in /etc/nagios3/conf.d/ were owned by the nagios group, of which user nagios is a member.
I don't understand how the filepermissions are the 'most important problem' in this.
Turns out it's not so much
Turns out it's not so much the ownership that is the problem, but permissions. Puppet creates the files with a 0600 mode, making it unreadable for Nagios.
purge logic is cumbersome...
purge logic is cumbersome... set recurse, purge, force on resource dir, and expire your nodes in puppet properly (puppet node clean/deactivate foo) and you'll achieve the same.
Reply to comment | Linux Journal
I am no longer positive the place you are getting your information, but great topic.
I must spend some time finding out more or figuring out
more. Thanks for great info I used to be in search of this information for
my mission.
Reply to comment | Linux Journal
Thаnks fοr fіnally wrіtіng about >
Reply to comment | Linux Journal < Liked it!