GAR: Automating Entire OS Builds
I'm a member of the LNX-BBC Project. The LNX-BBC is a business card-sized CD-ROM containing a miniature distribution of GNU/Linux. In order to fit everything we wanted on the roughly 50MB volume, we had to build every single piece of software from scratch and weed out unnecessary files.
The first few LNX-BBC releases were done entirely by hand, pulling precompiled binaries from existing distributions and hand-compiling some 200 software packages. While ultimately successful, this approach made upgrading individual packages difficult. It was also impossible to keep our work in any sort of revision-control system.
User requests for source code revealed another problem with our development process. We had written some scripts to automate the creation of the compressed loopback filesystem and the El-Torito bootable ISO image, but the really difficult work was the compilation and installation of all the packages. We had no way of giving people a single tarball for building a BBC from scratch.
What we needed was a system for automating the compilation and installation of all of the third-party software. It needed to allow us to store our customizations in CVS and provide a simple mechanism for end users to build their own LNX-BBC ISO images.
These requirements had been noticed and met before. In 1994, Jordan Hubbard began work on the BSD Ports system. The FreeBSD operating system includes a great many programs and utilities, but it is not complete without a number of third-party programs. The BSD Ports system manages the compilation and installation of third-party software that has been ported to BSD.
Often when one asks for help with FreeBSD, an expert may answer by saying ``Just use ports!'' and listing the following commands:
cd /usr/ports/<category>/<package>
make install
to suggest that the user needs to install a particular software package. This is similar to the way many Debian experts will tell people to apt-get install <package>. It's a simple way for a user to install software and related dependencies.
The Ports system is written entirely in pmake, the version of the make utility that comes with BSD. The choice to use make is both an obvious and a novel one. Make can be thought of as a language designed to automate software compilation and has many facilities for expressing build dependencies and rule-based build actions.
On the other hand, make has very limited flow control and lacks many features traditionally found in procedural programming languages. It can be rather unwieldy when used to build large projects.
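To illustrate what make brings to the table, here is a minimal Makefile expressing a rule-based dependency chain. The file names and flags are invented for illustration; they are not from the GAR or Ports sources:

```makefile
# Hypothetical example: make's rule syntax encodes a dependency graph.
# 'hello' is relinked only when hello.o changes; hello.o is recompiled
# only when hello.c changes. Make walks this graph automatically.
CC     = cc
CFLAGS = -Wall

hello: hello.o
	$(CC) -o $@ hello.o

hello.o: hello.c
	$(CC) $(CFLAGS) -c hello.c

clean:
	rm -f hello hello.o
```

Running `make hello` twice in a row demonstrates the point: the second run does nothing, because the targets are already up to date with respect to their prerequisites.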
As of this writing, the core Ports runtime for FreeBSD has undergone 400 revisions since 1994. You can see all of the revisions and their changelog entries at http://www.freebsd.org/cgi/cvsweb.cgi/ports/Mk/bsd.port.mk. The collection of software currently contains nearly 4,000 packages.
I had made the mistake, in 1998, of making a fuss about the GNU system needing something like Ports. I made claims about how much more elegant Ports would be if written in GNU make and spent a lot of time reading the FSF's make book and the NetBSD Ports source code. It wasn't until an LNX-BBC meeting in 2001 that someone called my bluff, and I actually had to sit down and write the thing.
GAR ostensibly stands for the Gmake Autobuild Runtime because it's a library of Makefile rules that provide Ports-like functionality to individual packages. (It's actually just named GAR because that's my favorite interjection: ``Gar!'')
From the user's perspective, the GAR system might as well be a tree of carefully maintained source code, ready to compile. The reality is that the system is just a tree of directories containing Makefiles, and the only thing stored in the GAR system itself is the information necessary to perform the steps a user would take in compiling and installing the software.
The base of the GAR directory tree contains a number of directories. These directories are package categories, and within each category is a directory for each package. Inside a package directory is (among other things) a Makefile.
By way of the GAR system libraries, this Makefile provides seven basic targets for the package: fetch, checksum, extract, patch, configure, build and install.
Thus, to install Python using the BBC's GAR tree, one would cd to lang/python and run make install. To look at the source code to netcat, one would cd to net/netcat and run make extract.
Each of these seven targets runs all previous targets, though any of them may be undefined for a given package. If you run make patch, it's the same as running
make fetch checksum extract patch
fetch: this target downloads all files and patches needed to compile the package. Typically this is a single tarball, accompanied by the occasional patch file.
checksum: uses md5sum to ensure that the downloaded files match those with which the package maintainer worked.
extract: makes sure that all of the necessary source files are available in a working directory. In some cases (such as when downloading a single C source file), this will simply copy files over.
patch: if the package has to be patched (either via third-party patches or package maintainer patches), this target will perform that step.
configure: configures the package as specified in the Makefile. It will typically run the package's underlying configuration system (such as autoconf or Imake).
build: performs the actual step of compilation.
install: puts files in the proper locations and performs any necessary mop-up work.
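The checksum step above can be sketched in plain shell with md5sum. The file name here is invented for illustration, and this is only an approximation of what the real target does:

```shell
# Simulate the checksum target's job with md5sum.
# The package maintainer records checksums of the distfiles once:
printf 'pretend source tarball\n' > netcat-1.10.tar.gz
md5sum netcat-1.10.tar.gz > checksums

# A user's 'make checksum' then verifies the downloaded files
# against the recorded values; a mismatch would fail the build:
md5sum -c checksums
```

If a mirror served a corrupted or tampered file, `md5sum -c` would report `FAILED` and exit nonzero, stopping the build before the bad source is ever extracted.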
These targets are named after their counterparts in the BSD Ports system and behave in the same manner.
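A package Makefile in a GAR tree, then, mostly just declares where to find the software and pulls in the shared rule library, which supplies all seven targets. The variable names, library path and URL below are a hedged sketch from memory, not a verbatim copy of a real GAR package:

```makefile
# Hypothetical net/netcat/Makefile in a GAR tree.
# The included gar.mk library provides the fetch, checksum, extract,
# patch, configure, build and install targets for every package.
GARNAME      = netcat
GARVERSION   = 1.10
CATEGORIES   = net
MASTER_SITES = http://example.org/dist/
DISTFILES    = $(GARNAME)-$(GARVERSION).tgz

include ../../gar.mk
```

Because the per-package file is this small, the whole tree stays cheap to keep in CVS, which was one of the original requirements.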