Large-Scale Linux Configuration Management
Before the introduction of LCFG, we were configuring machines using a typical range of techniques, including vendor installs and disk copying (cloning). These were followed by the application of a monolithic script which applied assorted “tweaks” for all the different configuration variations. This met virtually none of the requirements listed above and was a nightmare to manage.
The available alternatives ranged from large commercial systems (too expensive and probably too inflexible) to systems developed at individual sites for their own use (often not much of an improvement over our existing process). More recently, interesting tools such as COAS and the GNU cfengine (see Resources 5) have appeared, but we are still not aware of any comparable system which addresses quite the same set of requirements as LCFG.
Given limited development resources, we attempted to design an initial system as a number of independent subsystems, intending to use temporary implementations for those where we could leverage existing technology:
Resource Repository: design a standard syntax for representing resources (individual configuration parameters). These would be stored in a central place where they could be analysed and processed as well as distributed to individual machines.
Resource Compiler: preprocess the resources so that we could create configurations by inheritance and avoid specifying large numbers of low-level resources explicitly.
Distribution Mechanism: distribute the master copy of the resources to clients on demand in a robust way.
Component Framework: provide a framework which allows components to be easily written for configuring new subsystems and services, using the resources from the repository.
Core Components: implement a number of core components, including basic OS installation and the standard daemons. We wanted some of these to act as exemplars to make it as easy as possible for other people to create new components.
Items of configuration data are represented as key/value pairs, in a way similar to X resources. The key consists of three parts: the hostname, the component and the attribute. For example, the nameserver (cul) for the host wyrgly is configured by the DNS component:
wyrgly.dns.servers: cul.dcs.ed.ac.uk
Notice that this specification is a rather abstract representation, not directly tied to the form in which the configuration is actually required by the machine (in this case, as a line in the resolv.conf file). This allows the same representation to be used for different platforms, and it permits high-level programs to analyse and generate the resources easily. The LCFG components on each machine are responsible for translating these resources into the appropriate form for the particular platform. COAS uses a similar representation for configuration parameters.
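As a rough illustration of the split between the abstract resource and its platform-specific form, the following Python sketch parses a resource line and renders it for resolv.conf. It is not LCFG code: the names parse_resource and render_resolv_conf, the whitespace-separated server list, the address table and the address 192.0.2.53 are all assumptions made for the example. A lookup step is included because resolv.conf expects IP addresses rather than host names.

```python
from typing import Dict, NamedTuple


class Resource(NamedTuple):
    host: str
    component: str
    attribute: str
    value: str


def parse_resource(line: str) -> Resource:
    """Split 'host.component.attribute: value' into its parts.

    Assumes the host part is a short name containing no dots.
    """
    key, _, value = line.partition(":")
    host, component, attribute = key.strip().split(".", 2)
    return Resource(host, component, attribute, value.strip())


def render_resolv_conf(resource: Resource, addresses: Dict[str, str]) -> str:
    """Translate a dns.servers resource into resolv.conf lines.

    `addresses` is a hypothetical name-to-address table standing in for
    whatever lookup a real component would perform.
    """
    servers = resource.value.split()  # assumption: whitespace-separated list
    return "".join(f"nameserver {addresses[name]}\n" for name in servers)


res = parse_resource("wyrgly.dns.servers: cul.dcs.ed.ac.uk")
# 192.0.2.53 is a documentation-range address used purely as a placeholder.
print(render_resolv_conf(res, {"cul.dcs.ed.ac.uk": "192.0.2.53"}), end="")
```

A component for another platform could reuse parse_resource unchanged and supply only a different rendering function.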
The resources are currently stored in simple text files, with one file per host. This collection of files forms the repository. We intend to provide a special-purpose language for specifying these resources; it would support inheritance, default configurations, validation and some concept of higher-level specifications. However, we are currently using a “temporary” solution based on the C preprocessor, followed by a short Perl script to preprocess the resources. The C preprocessor provides file inclusion and macros, which can be used for primitive inheritance. The Perl script allows inherited resources to be modified with regular expressions. Wild cards are also supported to provide default values.
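The two facilities attributed to the Perl script, wild-card defaults and regular-expression modification of inherited values, might be sketched in Python as follows. The actual LCFG syntax is not shown in this article, so the use of "*" in the host position, the function names lookup and modify, and the user name guest are assumptions for illustration only.

```python
import re
from typing import Dict, Optional


def lookup(resources: Dict[str, str], host: str, key: str) -> Optional[str]:
    """Return the host-specific value if present, else the wild-card default."""
    for candidate in (f"{host}.{key}", f"*.{key}"):
        if candidate in resources:
            return resources[candidate]
    return None


def modify(resources: Dict[str, str], full_key: str, pattern: str, repl: str) -> None:
    """Rewrite an inherited value in place using a regular expression."""
    resources[full_key] = re.sub(pattern, repl, resources[full_key])


resources = {
    "*.dns.servers": "cul.dcs.ed.ac.uk",  # default for every host (assumed syntax)
    "wyrgly.auth.users": "paul",          # value inherited from a template
}

# No host-specific entry exists, so the wild-card default is returned.
print(lookup(resources, "wyrgly", "dns.servers"))

# Extend the inherited user list rather than restating it in full.
modify(resources, "wyrgly.auth.users", r"\bpaul\b", "paul guest")
print(resources["wyrgly.auth.users"])
```

Checking the host-specific key first means an explicit entry always wins over a default, which matches the idea of overriding a few resources on top of a standard template.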
In practice, most machines have very short resource files which simply inherit some standard templates. Machines can be cloned simply by copying these resource files. Often, a few resources are overridden to provide slight variations. For example:
#include <generic_client.h>
#include <linux.h>
#include <portable.h>
amd.localhome: paul
auth.users: paul
The name of the host is not necessary in the resource keys, because this is generated from the name of the resource file.
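A minimal sketch of how a host name could be derived from the file name and prefixed to each key is given below in Python. The function name qualify and the path repository/wyrgly are hypothetical, and the sketch simply skips preprocessor directives, whereas the real pipeline would expand them with the C preprocessor first.

```python
from pathlib import Path
from typing import List


def qualify(path: str, lines: List[str]) -> List[str]:
    """Prefix each resource key with the host name taken from the file name."""
    host = Path(path).stem  # assumption: the file is named after the host
    qualified = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and preprocessor directives
        qualified.append(f"{host}.{line}")
    return qualified


source = [
    "#include <generic_client.h>",
    "amd.localhome: paul",
    "auth.users: paul",
]
for entry in qualify("repository/wyrgly", source):
    print(entry)
```

Because the host name comes only from the file name, copying the file under a new name is enough to produce a correctly keyed configuration for the clone.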
Resources are currently distributed to clients using NIS (Sun's Network Information System). This is another “temporary” solution which is far from ideal; we hope to replace it in the near future.