Content Management

Give your web site newspaper-grade content management with open-source software.

Remember the good ol' days of the Web? Back when a webmaster was a jack-of-all-trades, doing everything from graphic design to database programming to DNS table manipulation? As the Web matured, however, one-person web sites became increasingly rare. True, it's still relatively easy for a someone to create and maintain a simple web site, but even the smallest organizations typically split responsibility between programmers, designers and the people who provide content. Moreover, many organizations want different people to be responsible for different types of content, with each having ultimate authority over a particular section.

Of course, this is old hat to the publishing world. Back when I edited my college newspaper, we used a composition and typesetting system called Atex. Atex was beloved for many reasons but mostly because it worked the way that newspapers do. Reporters using an Atex system would send articles to their editors by pressing the Send button on a massive, specialized keyboard. Editors could look at the list of articles awaiting their attention, edit articles, send an article back to the reporter who wrote it or send articles onto the Typesetting department. By design, everyone was forbidden from viewing, modifying or retrieving articles they had sent to the next person in the process chain. The image of a reporter shouting “stop the presses!” might be romantic and inspiring, but it is also unrealistic in today's world, where newspapers are businesses with tight deadlines.

As web sites grow to resemble newspapers, we should not be surprised to see them adopting software—known as content management systems, or CMS—that works much like Atex used to do. But organizing documents, people and work flow is a difficult task, particularly if you try to put everyone's needs into a single software package. So even though content management is crucial to an increasing number of web sites, CMS salespeople have gained a reputation for selling bloated, expensive software that is large on promises and small on delivery.

What Does a CMS Do?

One of the problems with content management is that every web site has different needs. For this reason, proprietary CMS software usually is sold in two parts. The customer first pays for the basic software and then pays at least as much in consulting and support services. Thus, CMS software is not only expensive but requires a fair amount of implementation and testing time. In other words, a CMS usually is closer to a toolkit than a finished application. Most of these toolkits include the following functionality:

  • Users: if everyone on a web site is going to be given different permissions, obviously each user will need a different login. A CMS thus comes with user-management software, allowing you to create, delete, edit and ban users on the system. Most systems also make it possible for users to retrieve forgotten passwords.

  • Permissions: just as Linux allows you to set read, write and execute permissions on different files, CMS software typically allows the site administrator to define different permissions for each user on the system. To return to our newspaper analogy, reporters are allowed to enter content, editors can modify content (or return it to a reporter) and publishers can make content publicly available (or return it to an editor).

  • Groups: although you theoretically can assign permissions to individual users, this quickly becomes tedious. So, most CMS software allows you to group users together for the sake of assigning permissions. For example, you can indicate that Tom, Dick and Harry can write and edit but not publish, or you can assign these permissions to the Canonical Names group, with the same effect.

  • Templates: many templating systems exist, including JSP, HTML::Mason and PHP. The best ones separate design, content and programming logic from each other, so designers, writers and programmers can work on a site simultaneously without stepping on one another's toes.

  • Publishing: the Web's biggest double-edged sword is its instantaneousness. The moment you modify foo.html on your server, everyone can see what changes you made. What if you made a mistake? What if you want to test the file beforehand? The CMS solution is to mark each piece of content as published when it should be viewed by the outside world. Until an article has been published, it is invisible.

  • Staging or previewing: just as newspaper and magazine publishers want to see what the finished product will look like before they begin to print actual copies, web publishers want to preview their site before it is live on the Web. Thus, many sites run staging servers, identical in most ways to their production servers except they are hidden from the outside world. Testing is done on these preview servers; when the editor or publisher is satisfied, content is pushed to the production servers. A CMS almost certainly will allow you to set up your system in this way.

  • Work flow: staging is the final step in what might be a long journey from an author's workstation to a production web server. How content makes its way through the system is known as work flow, and much of what a CMS does is allow you to define and manage that work flow. Should reporters be allowed to yank stories back from their editors? How many levels of editors do you want? Where do designers fit in? Who gives the final send-off to content? All of these questions are handled by the work-flow portion of a CMS.

  • Publishing dates: the good news about the Web is that things are published instantaneously. But what if your corporation is announcing a stock split and cannot reveal that information until 9:00 AM on Monday? You could sit next to the computer, waiting until the clock strikes 9:00 to press the Enter key and revealing the document for everyone to see. Or you can use a CMS, which typically allows you to specify when an article will appear, as well as when it should expire.

  • Web-based editing: although a web browser is one of the worst possible programs to use for serious text editing, most CMS systems allow you to write some or all of your documents using your browser. To be fair, just about every CMS also lets you upload files from your local computer. Web-based editing comes in handy when you're on the run or want to touch up one or two things. Of course, any CMS that offers such editing facilities also checks that someone trying to edit a page is authorized to do so.

  • Search: most CMS packages offer some sort of search facility, so you can find documents within the system.

Although this list is by no means exhaustive, it should give you a sense of the types of problems that a CMS tries to solve. But as you can imagine, every CMS offers a slightly different set of features and different ways of attacking these problems.

Because a CMS spends much of its time storing, retrieving and tracking content, it should come as no surprise that a database is almost essential to a CMS. Commercial CMS packages typically expect you to use a proprietary database system, such as Oracle or Microsoft's SQL Server. As you might expect, open-source CMS software generally is designed to work best with open-source databases, such as MySQL or PostgreSQL. Zope's Content Management Framework (CMF), which is a toolkit for creating a custom CMS, also uses a database, but in this case, it's the built-in Zope Object Database (ZODB) rather than an external relational database.