Session Management with Mason
Over the last two months, we looked at the Perl-based web development known as Mason. Mason, written and maintained by Jonathan Swartz, is based on the idea of “components”, flexible templates that can contain HTML, Perl code, or a bit of each. Mason makes it possible to create large, dynamic web sites in a relatively short period of time. Moreover, it lends itself to easy and extensive code reuse, removing many of the common maintenance issues associated with web sites.
Because Mason is traditionally run on top of mod_perl, an Apache module that places a complete Perl binary inside of the web server, it can take advantage of other Perl modules developed for mod_perl. One particularly useful example of such a module is Apache::Session, which makes it possible to get around some of the problems associated with HTTP's statelessness.
This month, we will look at Apache::Session, using it to create a simple user registration system based on Mason. This system can make it relatively simple to create a personalized web site, connecting information in a relational database with a particular user.
HTTP was designed to be a lightweight protocol, with each transaction taking a minimum amount of time. As a result, it is fairly minimalist, with each connection consisting of a single request-response pair. (Modern versions of HTTP support multiple request-response pairs within a single transaction, but my impression is that the single-transaction version 1.0 is still the norm.)
In this model, the HTTP client connects to the server, sends a request and an optional parameter, and then one or more headers that describe the browser's capabilities. The HTTP server then returns one or more headers describing the response, followed by the response itself. The response can be an HTML-formatted text document, an image, or an error message indicating that the request could not be fulfilled. After the server sends this response, it closes the connection.
Because each HTTP transaction takes place in a vacuum, without any information from other transactions, it is difficult to keep track of a user's actions. The web has no sense of “logging in” or “logging out”, unlike a traditional computer environment. It is impossible to know whether five HTTP requests were initiated by five separate users on the same computer, or one user interested in five different URLs.
Two main techniques get around these problems. The first, called “cookies”, allows the server to store a name-value pair on the user's computer. The cookie is set in a “Set-Cookie” header at the beginning of the server's HTTP response. Every time the browser returns to a site within this server's domain, it sends a “Cookie” header as part of the request, with the name-value pair that was previously stored. Cookies are limited in length, can be deleted by a browser at any time and can easily be inspected and modified by a user.
Another technique, which we will not explore this month, involves the use of a URL's “path_info” segment. For example, consider the URL www.example.com/cgi-bin/foo.pl/abc/def. If /cgi-bin/foo.pl exists on the server, then /abc/def is passed as an additional argument that exists separately from any name-value pairs submitted from the client.
While neither cookies nor path_info is a perfect solution for the issue of state on the Web, they are sufficient for most needs. However, these solutions address only the problems with HTTP; they don't provide a means for giving our programs a sense of state.
Apache::Session bridges this gap, making it possible to associate arbitrary information along with a user. (We will soon discover that things are not quite this simple, but the overall principle is sound.) Apache::Session, which is available from CPAN, works with either cookies or path_info, and can store information using mechanisms ranging from ASCII files to relational databases. It is designed to work with mod_perl, and thus works with Mason; the documentation indicates that Apache::Session should also work under CGI, although I have not tested this claim.
Because of their versatility and speed, and because Apache::Session works best when associated with additional information in a relational database, we will use MySQL for our back end, called the “object store” in the module's documentation. In order to do this, we will need to create a table named “sessions” in our database, which looks something like this:
CREATE TABLE sessions ( id CHAR(16), length INT(11), a_session TEXT );
Apache::Session requires the table to be named sessions and that it contain three columns: an id column of type CHAR(16), a length column of type INT(11) and an a_session column of type TEXT or BLOB, which can contain any amount of binary data.
Each unique session is identified by a unique 16-character string, stored in the id column. The actual session data is stored in the a_session column, in the “nfreeze” format defined by the Storable module. (Storable is also available from CPAN.)
Practical books for the most technical people on the planet. Newly available books include:
- Agile Product Development by Ted Schmidt
- Improve Business Processes with an Enterprise Job Scheduler by Mike Diehl
- Finding Your Way: Mapping Your Network to Improve Manageability by Bill Childers
- DIY Commerce Site by Reven Lerner
Plus many more.