Content Management

Keep track of updates to your web site documents with this Mason application.

Over the last few months, we have looked at Mason, the web development system written by Jonathan Swartz that combines Apache, mod_perl and HTML/Perl templates. Normally, Mason is associated with web applications, especially those that use back-end databases and produce dynamic content. Indeed, I have found Mason to be an extremely useful tool in my consulting practice, making it possible to create web sites quickly and easily.

But Mason can be used for more than simple server-side web applications. One of the most interesting Mason applications is a set of components known as the “Mason content manager” or “Mason-CM”. Mason-CM provides basic content management functionality, beginning with its ability to handle the staging of files to a production server, and continuing with built-in support for spell checking, RCS version control and editing of Mason components.

While Mason-CM is written in Mason and thus requires Apache, mod_perl and the appropriate Perl modules, it can work just fine with a static site using nothing but HTML and graphic images. In fact, I recommend that even site owners uninterested in Mason and mod_perl take a look at the Mason content manager, because it provides so many useful features in a simple, free package.

Why Content Management?

As web sites have become increasingly complex, so too have the organizations surrounding them. Whereas it used to be common for a professional site to be a one-person operation, it is now usual for even a small or medium-sized site to have at least three people: a writer/editor, a designer and a programmer. Even such a small team will eventually find multiple members trying to edit the same file simultaneously. This problem was solved years ago by version control systems, such as RCS and CVS—but the tools were designed by and for programmers, and can often be daunting for a designer or editor to use.

Moreover, the Web presents a number of challenges that are different from development in the traditional software world. For example, software is traditionally written, compiled, tested and debugged, at which point the cycle begins again until the software is released. The Web, however, works differently. As soon as an HTML file in the web document hierarchy is edited, it is immediately available to everyone on the Internet. This is good news when someone finds a mistake needing to be fixed quickly, and also means that sites can update their content on a regular basis without having to go through long processes. However, it also means that the results of editing a file—whether they are improvements or mistakes—are immediately available to anyone who happens to enter the right URL at the wrong time.

For all of these reasons, most medium- and large-scale web sites now run two web servers. The first, known as a “staging server”, is where the developers, editors and designers can write, edit and test their changes to the site. Only after files have been fully tested are they sent to the second web server, known as the “production server”.

Of course, these two servers need not be on separate computers. They merely need to be separate in some way, to keep outsiders from seeing the content as it is tested and to ensure that the staging and production servers have parallel directory structures.

Mason-CM is a set of Mason components that makes it relatively easy for anyone to set up a content management system on their own server. The user interface is not beautiful, and I had some minor problems getting it installed on my system. However, it does the job more than adequately, and makes it possible for multiple people to work on a project without stepping on each other's toes.

Installing Mason-CM

Before you can install Mason-CM, you will need to have a working copy of Apache with mod_perl and Mason installed. (If you need help installing Mason, see the last three months of “At the Forge”.) In addition, you will need to download and install several Perl modules from CPAN, including MLDBM, Image::Size, URI::Escape and File::PathConvert. Make sure to import these in your Apache configuration file (httpd.conf) with the PerlModule directive, so as to increase the amount of memory shared among the various Apache child processes.

Mason-CM comes as a gzipped tar file from the Mason home page (see Resources). The archive should be unpacked into a directory under your Mason component root. I chose to put it in /usr/local/apache/mason/cm, where /usr/local/apache/mason is my Mason component root directory. If you have unpacked the archive correctly, the /cm directory should contain a README and an INSTALL file. I will cover all of the steps needed for installation, but it is probably a good idea to read through those files just in case.

Because Mason-CM must ensure that each file is being edited by only one user at a time, and because a content-management system should be available to authorized users only, you must restrict access to the /cm directory using HTTP authentication. Mason-CM will refuse to run unless it is in a directory that has been password protected.

The easiest way to password-protect a directory is to create a .htaccess file in it. The .htaccess file overrides the default Apache configuration settings for its directory, as well as any subdirectories underneath it. For example, here is the .htaccess file for my content management system:

AuthName "Content Management System"
AuthType Basic
AuthUserFile /usr/local/apache/conf/staging-passwords
require valid-user

The AuthName directive defines the string that will be displayed to users when prompted for a password. (Without such a string, it might be hard for users to remember which system is requesting a user name and password.) The AuthUserFile directive should point to a file containing user names and encrypted passwords.

The password file should be outside of the web document root directory, so that users cannot retrieve it using their web browsers. To create or edit the password file, you will need to use the “htpasswd” program, which is installed by default in /usr/local/apache/bin.

The “require valid-user” tells Apache that without a user name and password that match the entries in AuthUserFile, the user will be denied access to the directory.

If the .htaccess file does not have any effect, check the AllowOverride directive in httpd.conf. This directive indicates which Apache configuration options can be overridden by an .htaccess file. By default, Apache servers are configured not to let .htaccess files modify directives of type AuthConfig. You can change this by putting the following section in httpd.conf:

<Directory /usr/local/apache/mason/cm>
AllowOverride AuthConfig

Finally, I found that there was a small bug in my installation of Mason-CM. Many of the components lack a .html suffix, making it difficult or impossible for Apache to identify the content as “text/html”. So even though Mason-CM components produced output in HTML-formatted text, the contents were interpreted by my browser as if they were in unformatted ASCII. To give Apache some help, I explicitly set the content type using the mod_perl “content_type” method. The following autohandler, placed inside of the /cm directory, automatically sets the content type to “text/html” for each document within the directory:

<% $m->call_next %>
Because my configuration file is set to ignore non-text files, I can be sure that the above will not accidentally force JPEG and PNG images to be rendered with a type of “text/html”.