Using Hiera with Puppet

With Hiera, you can externalize your systems' configuration data and easily understand how those values are assigned to your servers. With that data separated from your Puppet code, you then can encrypt sensitive values, such as passwords and keys.

Separating code and data can be tricky. In the case of configuration management, there is significant value in being able to design a hierarchy of data—especially one with the ability to cascade through classifications of servers and assign one or several options. This is the primary value that Hiera provides—the ability to separate the code for "how to configure the /etc/ntp.conf" from the values that define "what ntp servers should each node use". In the most concise sense, Hiera lets you separate the "how" from the "what".

The idea behind separating code and data is more than just having a cleaner Puppet environment; it allows engineers to create more re-usable Puppet modules. It also puts your variables in one place so that they too can be re-used, without importing manifests across modules. Hiera's use cases include managing packages and versions or using it as a Node Classifier. One of the most compelling use cases for Hiera is for encrypting credentials and other sensitive data, which I talk about later in this article.

Puppet node data originally was managed through node inheritance, which is no longer supported, and subsequently through using a params.pp module subclass. Before Hiera, it was necessary to modify the params.pp module class locally within the module, which frequently damaged the re-usability of the module. params.pp still is used in modules today, but as of Puppet version 3, Hiera is not only the default, but also the first place checked for variable values. When a variable is defined in both Hiera and a module, Hiera takes precedence by default. As you'll see, it's easy to use a module with params.pp and store some or all of the variable data in Hiera, making it easy to migrate incrementally.

To get started using Hiera with your existing Puppet 3 implementation, you won't have to make any significant changes or code migrations. You need only a hierarchy file for Hiera and a yaml file with a key/value pair. Here is an example of a Hiera hierarchy:


hiera.yaml:

:backends:
    - yaml
:yaml:
  :datadir: /etc/puppet/hieradata
:hierarchy:
  - "node/%{::fqdn}"
  - "environment/%{::env}/main"
  - "environment/%{::env}/%{calling_module}"
  - defaults

And a yaml file:


/etc/puppet/hieradata/environment/prod/main.yaml:
---
$nginx::credentials::basic_auth: 'password'

Hiera can have multiple back ends, but for now, let's start with yaml, which is the default and requires no additional software. The :datadir: is just the path to where the hierarchy search path should begin, and is usually a place within your Puppet configuration. The :hierarchy: section is where the core algorithm of how Hiera does its key/value lookups is defined. The :hierarchy: is something that will grow and change over time, and it may become much more complex than this example.

Within each of the paths defined in the :hierarchy:, you can reference any Puppet variable, even $operatingsystem and $ipaddress, if set. Using the %{variable} syntax will pull the value.

This example is actually a special hierarchical design that I use and recommend, which employs a fact assigned to all nodes called @env from within facter. This @env value can be set on the hosts either based on FQDN or tags in EC2 or elsewhere, but the important thing is that this is the separation of one large main.yaml file into directories named prod, dev and so on, and, therefore, the initial separation of Hiera values into categories.

The second component of this specific example is a special Hiera variable called %{calling_module}. This variable is unique and reserved for Hiera to indicate that the yaml filename to search will be the same as the Puppet module that is performing the Hiera lookup. Therefore, the way this hierarchy will behave when looking for a variable in Puppet is like:


$nginx::credentials::basic_auth 

First, Hiera knows that it's looking in /etc/puppet/hieradata/node for a file named <hostname.domain.tld>.yaml and for a value for nginx::credentials::basic_auth. If either the file or the variable isn't there, the next step is to look in /etc/puppet/hieradata/environment/<prod|stage|dev>/main.yaml, which is a great way to have one yaml file with most of your Hiera values. If you have a lot of values for the nginx example and you want to separate them for manageability, you simply can move them to the /etc/puppet/hieradata/environment/<prod|stage|dev>/nginx.yaml file. Finally, as a default, Hiera will check for the value in defaults.yaml at the top of the hieradata directory.

Your Puppet manifest for this lookup should look something like this:


modules/nginx/manifests/credentials.pp


class nginx::credentials (
  basic_auth = 'some_default',
){}

This class, when included, will pull the value from Hiera and can be used whenever included in your manifests. The value set here of some_default is just a placeholder; Hiera will override anything set in a parameterized class. In fact, if you have a class you are thinking about converting to pull data from Hiera, just start by moving one variable from the class definition in {} to a parameterized section in (), and Puppet will perform a Hiera lookup on that variable. You even can leave the existing definition intact, because Hiera will override it. This kind of Hiera lookup is called Automatic Parameter Lookup and is one of several ways to pull data from Hiera, but it's by far the most common in practice. You also can specify a Hiera lookup with:


modules/nginx/manifests/credentials.pp


class nginx::credentials (
  basic_auth = hiera('nginx::credentials::basic_auth'),
){}

These will both default to a priority lookup method in the Hiera data files. This means that Hiera will return the value of the first match and stop looking further. This is usually the only behavior you want, and it's a reasonable default. There are two lookup methods worth mentioning: hiera_array and hiera_hash. hiera_array will find all of the matching values in the files of the hierarchy and combine them in an array. In the example hierarchy, this would enable you to look up all values for a single key for both the node and the environment—for example, adding an additional DNS search path for one host's /etc/resolv.conf. To use a hiera_array lookup, you must define the lookup type explicitly (instead of relying on Automatic Parameter Lookup):


modules/nginx/manifests/credentials.pp


class nginx::credentials (
  basic_auth = hiera_array('nginx::credentials::basic_auth'),
){}

A hiera_hash lookup works in the same way, only it gathers all matching values into a single hash and returns that hash. This is often useful for an advanced create_resources variable import as well as many other uses in an advanced Puppet environment.

______________________