Managing Linux Using Puppet

Removing Rules

When defining rules in Puppet, it is important to keep in mind that removing a rule for a resource is not the same as adding a rule that removes that resource. For example, suppose you have a rule that creates an authorized SSH key for "developerA". Later, "developerA" leaves, so you remove the rule defining the key. Unfortunately, this does not remove the entry from authorized_keys. In most cases, the state defined in Puppet resources is not considered definitive; changes outside Puppet are allowed. So once the rule for developerA's key has been removed, Puppet has no way to know whether the key was simply added manually or whether it should be removed.

In this case, you can use ensure => 'absent' to make sure that packages, files, directories, users and so on are deleted. The original Listing 1 showed an example of this that removes the emacs package. There is a definite difference between a rule ensuring that emacs is absent and having no rule for emacs at all: the first removes the package if it is installed, whereas the second leaves the machine however it happens to be.
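
As a minimal sketch of that difference, the two resources below ask Puppet to remove something actively rather than merely stop managing it. The key name 'developerA@laptop' and the account 'deploy' are hypothetical and are not taken from our manifests; substitute the comment field of the real key and the user whose authorized_keys file holds it.

```puppet
# Actively remove the package; deleting this rule instead would
# leave emacs installed wherever it already is.
package { 'emacs':
  ensure => absent,
}

# Actively remove a departed developer's key.
# 'developerA@laptop' and 'deploy' are made-up names for illustration.
ssh_authorized_key { 'developerA@laptop':
  ensure => absent,
  user   => 'deploy',
}
```

Once the change has reached every machine, the rules themselves can be deleted in a later release.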

At our office, when a developer or administrator leaves, we replace their SSH key with an invalid key, which then immediately updates every entry for that developer.

Existing Modules

Many modules are listed on Puppet Forge covering almost every imaginable problem. Some are really good, and others are less so. It's always worth searching to see if there is something good and then making a decision as to whether it's better to define your own module or reuse an existing one.

Managing Git

We don't keep all of our machines sitting on the master branch. We use a modified gitflow approach to manage our repository. Each server has its own branch, and most of them point at master. A few are on the bleeding edge of the develop branch. Periodically, we roll a new release from develop into master and then move each machine's branch forward from the old release to the new one. Keeping separate branches for each server gives flexibility to hold specific servers back and ensures that changes aren't rolled out to servers in an ad hoc fashion.

We use scripts to manage all our branches and fast-forward them to new releases. With roughly 100 machines, it works for us. On a larger scale, a separate branch for each server is probably impractical.

Using a single repository shared with all servers isn't ideal, because every server can read everything in it, including secrets meant for other machines. Storing sensitive information encrypted in Hiera is a good idea. There was an excellent Linux Journal article covering this: "Using Hiera with Puppet" by Scott Lackey in the March 2015 issue.

As your number of machines grows, using a single git repository could become a problem. The main problem for us is the amount of "commit noise" when changes to reusable modules are mixed with changes to machine-specific configurations. Second, you may not want all your admins to be able to edit all the modules or machine manifests, or you may not want all manifests rolled out to each machine. Our solution is to use multiple repositories: one for generic modules, one for machine-/customer-specific configuration and one for global information. This keeps our core modules separated and under proper release management while also allowing us to release critical global changes easily.

Scaling Up/Trade-offs

The approach outlined in this article works well for us. I hope it works for you as well; however, you may want to consider some additional points.

Because our servers differ from one another in inconsistent ways, using Facter or metadata to drive configuration isn't suitable for us. However, if you have 100 Web servers, using a hostname such as nginx-prod-099 to determine the install requirements would save a lot of time.
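
A hostname-driven rule might look something like the sketch below. It is only an illustration, not something we run: it assumes hostnames follow the pattern nginx-prod-NNN and that the structured networking fact from Facter 3 or later is available (older versions expose the same value as $::hostname).

```puppet
# Sketch: decide what to install from the machine's hostname.
# Assumes a naming scheme such as nginx-prod-099.
if $facts['networking']['hostname'] =~ /^nginx-prod-\d+$/ {
  package { 'nginx':
    ensure => installed,
  }

  service { 'nginx':
    ensure  => running,
    enable  => true,
    require => Package['nginx'],
  }
}
```

The trade-off is that the naming scheme becomes part of your configuration, so a misnamed machine silently gets the wrong setup.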

A lot of people use the Puppet master to roll out and push changes, and this is the general approach described in many on-line tutorials. You can combine this with PuppetDB to share information from one machine to another; for example, the public key of one server can be shared with another server.
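
The usual mechanism for this kind of sharing is exported resources. The following is a rough sketch rather than a tested setup: it assumes a Puppet master with PuppetDB configured, the sshkey resource type being available and the structured ssh fact from Facter 3 or later.

```puppet
# On every node: export this host's public RSA key.
# The master stores exported resources in PuppetDB.
@@sshkey { $facts['networking']['fqdn']:
  ensure => present,
  type   => 'ssh-rsa',
  key    => $facts['ssh']['rsa']['key'],
}

# On nodes that should trust those hosts: collect every
# exported sshkey and add it to the system known_hosts file.
Sshkey <<| |>>
```

A node sees another node's key only after the exporting node has completed a Puppet run, so new machines take two runs to become known everywhere.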


This article has barely scratched the surface of what can be done using Puppet. Virtually everything about your machines can be managed using the various Puppet built-in resources or modules. After using it for a short while, you'll experience the ease of building a second server with a few commands or of rolling out a change to many servers in minutes.

Once you can make changes across servers so easily, it becomes much more rewarding to build things as well as possible. For example, monitoring your cron jobs and backups can take a lot more work than the actual task itself, but with configuration management, you can build a reusable module and then use it for everything.
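
As an illustration of that idea, a reusable wrapper around cron could be sketched as a defined type like the one below. The type name monitored_cron, the ping_url parameter and the example URL and script path are all hypothetical; the sketch assumes Puppet 4 or later, that curl is installed and that your monitoring system raises an alert when an expected ping does not arrive.

```puppet
# Hypothetical defined type: a cron job that reports each
# successful run to a monitoring URL.
define monitored_cron (
  String $command,
  String $ping_url,
  String $user   = 'root',
  String $hour   = '*',
  String $minute = '0',
) {
  cron { $title:
    # The ping is sent only if the command exits successfully.
    command => "${command} && curl -fsS ${ping_url} > /dev/null",
    user    => $user,
    hour    => $hour,
    minute  => $minute,
  }
}

# Example use: a backup at 02:00 every night.
monitored_cron { 'nightly-backup':
  command  => '/usr/local/bin/backup.sh',
  ping_url => 'https://monitor.example.com/ping/nightly-backup',
  hour     => '2',
}
```

Every job declared this way gets the same monitoring for one extra line of configuration, which is where the leverage comes from.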

For me, Puppet has transformed system administration from a chore into a rewarding activity because of the huge leverage you get. Give it a go; once you do, you'll never go back!


David Barton is Managing Director of OneIT, a company specializing in custom business software development. He's been using Linux since 1998 and managing OneIT's Linux servers for more than 10 years.