Building Sites with Mason
When we speak of dynamically generated web content, most people immediately think of CGI, the “common gateway interface”. CGI is portable across web servers, languages and operating systems, but that portability comes at the expense of efficiency. For example, every invocation of a CGI program written in Perl results in the creation of a new process, in which the program must be compiled and then interpreted.
Programmers willing to trade speed for portability have a number of options at their disposal. Many Perl programmers have chosen to use mod_perl, which makes it possible to modify Apache's behavior using Perl modules. A Perl module invoked by mod_perl is run inside the Apache process, removing the overhead associated with starting a separate CGI process. mod_perl also caches the compiled version of invoked modules, eliminating the need to compile them each time they are run. The result is a dramatic improvement in speed, as well as the flexibility to modify Apache quickly and easily using Perl.
For all its power, mod_perl has never appealed to me for creating large, dynamic web sites. True, the increase in speed is extremely impressive, and it is not that hard to work with. As we have seen in several previous editions of this column, writing a mod_perl module can be quite easy, and it integrates smoothly into a larger web site.
When a site needs to create a large quantity of dynamic output, much of which is written and designed by non-programmers, mod_perl's power is hampered by the need to use dozens or hundreds of modules, each servicing only a single URL or directory. The solution to this problem is to integrate mod_perl with templates, which intersperse HTML-formatted text with Perl code. We have looked at templates on several previous occasions and have seen their power and flexibility.
This month, we will look at Mason, a mod_perl module written by Jonathan Swartz, which attempts to solve many of these problems. It uses templates and encourages the use of separate “components”, which can be built up to create a large, dynamically generated site. Because these components exist in separate files, Mason offers additional advantages:
It caches components, providing a bigger speed boost than that from simple templates.
It provides a complete debugging and previewing environment.
It produces output files.
I first heard about Mason nearly two years ago, and kept telling myself I would look at it one day. I finally took a serious look at Mason several months ago, and was extremely impressed with what I saw—enough that I expect to do most of my future development with it.
The number of publicly distributed Mason components is still relatively small, which makes it seem like a poorer development environment than Zope and commercial template solutions. However, this situation already appears to be changing, and the number of components will most likely increase significantly during the coming months and years.
While Mason can work as a CGI program, it works best and most easily with mod_perl. (See Resources for information on obtaining and installing Apache and mod_perl.) Retrieve the latest version of HTML::Mason from CPAN and follow the same procedure as you would for any other module, detailed in the INSTALL file that comes with it.
However, using Mason is more complicated than simply saying “use HTML::Mason”. Because it works via mod_perl, which is part of Apache, the Mason configuration must be performed when Apache starts up. This is accomplished most easily by using a PerlRequire statement in the Apache configuration file, normally called httpd.conf:
The above statement tells Apache to execute the Perl statements in /usr/local/apache/conf/mason.pl when it first starts up. Any Perl modules imported or variables declared in mason.pl are placed into a section of memory shared across all Apache processes. This can mean a substantial memory savings, since Perl and mod_perl consume a large amount of memory, and most web servers run prefork child Apache processes.
At the very least, mason.pl must create three different objects:
The Mason parser ($parser), which turns each Mason component into a Perl subroutine.
The Mason interpreter ($interp), which executes the subroutines that were created by the parser. When creating the interpreter (HTML::Mason::Interp), we name two directories in which the interpreter reads and writes information: the Mason “component root” where the Mason components sit, and the Mason “data directory” in which caching and debugging information are stored. I normally set the component root to /usr/local/apache/mason and the data directory to /usr/local/apache/masondata.
The Mason ApacheHandler ($ah), which handles the HTTP request and generates a response.
Warning: for reasons that are not entirely clear, Mason cannot handle symbolic links. Specifying a symbolic link as a directory name will lead to mysterious “File not found” errors. On my system, /usr/local is a symbolic link to /quam/local; while the Mason documentation does mention this quirk, it was not explicit enough to save me more than half an hour of installation.
A bare-bones mason.pl is shown in Listing 1. Notice how mason.pl defines the subroutine HTML::Mason::handler, which is invoked once for each incoming HTTP request. In this way, Mason is able to handle each HTTP request; the ApacheHandler takes the request and hands it to the Interpreter, which then reads a compiled component from the cache or parses it as necessary.
More advanced Mason installations use mason.pl to define all sorts of additional behavior. For example, Mason comes with a previewer/debugger component, making it possible to trace through the execution of a component and its subcomponents. It is also possible to define different ApacheHandler objects, one for each type of browser or request type.
Once our mason.pl file is installed, we must tell Apache to let HTML::Mason handle some incoming requests, rather than using the default Apache handlers. This is where we connect the component root to the Apache Handler. For example, if the component root is /usr/local/apache/mason, we can say the following:
Alias /mason /usr/local/apache/mason <Location /mason> SetHandler perl-script PerlHandler HTML::Mason </Location>
The Alias directive tells Apache to translate every URL beginning with /mason to the Mason component root, /usr/local/apache/mason. The <Location> section tells Apache that every URL beginning with /mason should then be handled by the mod_perl handler HTML::Mason.
|Non-Linux FOSS: libnotify, OS X Style||Jun 18, 2013|
|Containers—Not Virtual Machines—Are the Future Cloud||Jun 17, 2013|
|Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer||Jun 12, 2013|
|Weechat, Irssi's Little Brother||Jun 11, 2013|
|One Tail Just Isn't Enough||Jun 07, 2013|
|Introduction to MapReduce with Hadoop on Linux||Jun 05, 2013|
- Containers—Not Virtual Machines—Are the Future Cloud
- Non-Linux FOSS: libnotify, OS X Style
- Linux Systems Administrator
- Validate an E-Mail Address with PHP, the Right Way
- Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer
- Senior Perl Developer
- Technical Support Rep
- UX Designer
- Introduction to MapReduce with Hadoop on Linux
- RSS Feeds
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?