Personalizing “New” Labels

How to let the site visitor know which documents he hasn't seen.

Problems with this Approach

The above approach has two major problems, one having to do with the user interface and the second with performance.

Let's address the user interface issue first. In short, what happens if the user reloads the page? The first time he viewed the page, the cookie was set with the META tag, regardless of whether the cookie had been set before. The next time the user loads the page, even if it is just a few seconds later, the “new” labels no longer appear, because the cookie has been set, indicating the user visited the site within the last week. We need a finer-grained method for keeping track of these labels.

The second is a more serious problem—the performance hit. In order to implement this solution, we need to invoke at least two CGI programs for each document on our system. Given how resource-hungry even the most innocent CGI programs can be, particularly when written in Perl, this adds a tremendous load to the web server. Add to this the time it takes to start up a Perl process and execute such an external program, and our users will suffer as well, unless we make a significant investment in hardware.

We can solve the user interface problem with the Text::Template module, written by Mark-Jason Dominus and recently re-released as version 1.20. This module, as is the case for most modules, is available from CPAN (see Resources) or by using the CPAN module that comes with modern installations of Perl.

Text::Template allows us to mix Perl and HTML within a file. Everything within curly braces, {}, is considered to be a Perl program. The results of the block's evaluation are inserted into the document in place of its code block. Thus, if we say

<P>This is a first paragraph.</P>
<P>{ 2 + 5; }</P>
<P>This is a second paragraph.</P>

the end user will see

This is a first paragraph.
7
This is a second paragraph.

on his or her screen.

Remember, the result of evaluating a block is not the output from that block, but rather the return result from the final line in the block. So if we say:

<P>This is a first paragraph.</P>
<P>{ print 2 + 5;}</P>
<P>This is a second paragraph.</P>

we will see

7 This is a first paragraph.
1
This is a second paragraph.

The “7” comes from evaluating “print”, while the “1” is the returned value from the final line of the embedded Perl block.
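The difference between a block's value and its output can be checked outside the web server. What follows is a minimal sketch, assuming Text::Template is installed; the helper name fill_string is ours, and the templates are held in strings rather than files to keep the example short.

```perl
use strict;
use warnings;
use Text::Template;

# Hypothetical helper: fill in a template held in a string and return
# the resulting text.
sub fill_string {
    my ($source) = @_;
    my $template = Text::Template->new(TYPE => 'STRING', SOURCE => $source)
        or die "Cannot build template: $Text::Template::ERROR";
    my $text = $template->fill_in();
    die "Cannot fill in template: $Text::Template::ERROR"
        unless defined $text;
    return $text;
}

# The block's value is its last expression, so 7 lands in the paragraph.
my $by_value = fill_string('<P>{ 2 + 5; }</P>');

# print sends 7 straight to STDOUT; the block's value is print's own
# return value, which is 1 on success.
my $by_print = fill_string('<P>{ print 2 + 5; }</P>');
```

After this runs, $by_value holds the paragraph containing 7, while $by_print holds the paragraph containing 1, and the 7 from print has already gone to standard output on its own.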

Listing 4. template.pl

In order to use Text::Template, we will need to write a small CGI program that invokes the module and parses the indicated file. The program template.pl, shown in Listing 4, does the trick simply and easily. If we install it in our CGI directory, we can then go to /cgi-bin/template.pl?file.tmpl, and the template file.tmpl will be interpreted by template.pl, then returned to the user's browser.

In order to deal with potential security problems from people specifying unusual file names, we remove any occurrences of the string “../” and ensure all file names start in the directory /usr/local/apache/share/templates/. You may want to define a different templates directory on your system.
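Removing “../” has a subtle weakness if it is done as a single substitution pass: a name such as “....//” turns back into “../” once the inner occurrence is stripped. A stricter alternative is to accept only plain file names and reject everything else. The sketch below is not taken from Listing 4; the subroutine name template_path and the exact set of allowed characters are assumptions made for illustration.

```perl
use strict;
use warnings;

# Assumed location; the article uses this directory as its example.
my $TEMPLATE_DIR = '/usr/local/apache/share/templates';

# Hypothetical helper: return the full path for a requested template
# name, or undef if the name is not a plain file name like "file.tmpl".
sub template_path {
    my ($name) = @_;
    return unless defined $name;
    # Letters, digits, underscore and hyphen, with single dots between
    # parts: no slashes, no leading dot, no "..".
    return unless $name =~ /\A[A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)*\z/;
    return "$TEMPLATE_DIR/$name";
}

# Why stripping alone can fall short: one pass over "....//" removes
# the inner "../" and leaves a fresh "../" behind.
(my $stripped = '....//etc/passwd') =~ s{\.\./}{}g;
```

With this check, a request for file.tmpl maps to a path inside the templates directory, while names containing a slash or starting with a dot are refused outright instead of being repaired.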

Now that we have our templating system in place, we can rewrite our page as a template, in which the cookie is set and “new” labels are printed only when necessary. The final result is shown in Listing 5.

Listing 5. travel.html

We create the dynamic META tag with the following code:

<META HTTP-EQUIV="Set-Cookie"
CONTENT="RecentVisitor=1;
expires={scalar localtime(time + 604800)}; path=/">

As you can see, this META tag contains a small Perl block that returns an appropriate expiration date. The date is set to be 604,800 seconds in the future, better known as “one week from today”.
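The figure 604,800 is simply 7 × 24 × 60 × 60. One caveat is that scalar localtime returns a string in the server's local time zone with no zone label, whereas the original Netscape cookie specification describes dates of the form “Wdy, DD-Mon-YYYY HH:MM:SS GMT”. Browsers may well be lenient, but a template block could produce the documented form explicitly. Here is a minimal sketch using only built-in functions; the subroutine name cookie_expires is hypothetical.

```perl
use strict;
use warnings;

my @DAYS   = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Hypothetical helper: format an epoch time as a cookie expiration
# date in GMT, in the form "Wdy, DD-Mon-YYYY HH:MM:SS GMT".
sub cookie_expires {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime($epoch);
    return sprintf('%s, %02d-%s-%04d %02d:%02d:%02d GMT',
        $DAYS[$wday], $mday, $MONTHS[$mon], $year + 1900,
        $hour, $min, $sec);
}

# One week, spelled out so the arithmetic is visible.
my $ONE_WEEK = 7 * 24 * 60 * 60;    # 604800 seconds

my $expires = cookie_expires(time + $ONE_WEEK);
```

Inside the META tag, the Perl block would then return cookie_expires(time + 604800) instead of scalar localtime(time + 604800).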

We retrieve the cookie later in the template, just before deciding whether to print a “new” tag:

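use CGI;
# Create a CGI object only to read the cookie; the template prints the output
my $query = CGI->new;
my $visited_recently = $query->cookie('RecentVisitor');
# Append the label unless the cookie shows a visit within the last week
$outputstring .= "<font color=\"red\">(New!)</font>\n"
    unless $visited_recently;
# The block's final value is what the template inserts into the page
$outputstring;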

Notice how we can import the CGI module within a block of the template. We can then create an instance of CGI and use it to retrieve one or more cookies. We don't use CGI.pm to print output to the user's browser, since that will be done by the templating system.
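To see roughly what the cookie lookup involves, recall that a CGI program receives the browser's Cookie header in the HTTP_COOKIE environment variable. The sketch below is a simplified, hand-rolled stand-in for illustration only, not CGI.pm's actual code; it skips details such as decoding escaped characters, and both subroutine names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: look up one cookie in a raw Cookie header such
# as "RecentVisitor=1; Theme=dark". Returns undef when it is absent.
sub cookie_value {
    my ($header, $wanted) = @_;
    return unless defined $header;
    for my $pair (split /;\s*/, $header) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && defined $value;
        return $value if $name eq $wanted;
    }
    return;
}

# Hypothetical helper: the label to append after a link, empty for
# visitors who already carry the cookie.
sub new_label {
    my ($visited_recently) = @_;
    return $visited_recently ? '' : qq{<font color="red">(New!)</font>\n};
}

my $visited = cookie_value($ENV{HTTP_COOKIE}, 'RecentVisitor');
my $label   = new_label($visited);
```

In practice, CGI.pm remains the better choice inside the template, since it handles the parsing details for us; the sketch only shows that the decision comes down to a single true-or-false test on the cookie's value.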

Next Month

It would seem that my obsession with “new” labels has led us in all sorts of new and interesting directions. This month, we looked at cookies, server-side includes, CGI programs and HTML/Perl templates. While templates did reduce the load on the server somewhat, they still require the invocation of a CGI program, which is inherently more costly than serving a straight HTML file.

One solution to this problem is to make the labeling an inherent part of our web server. If the server could keep track of the cookies and the labeling, things would work fine. Most people don't want to mess around with their web server software to that degree; Apache might be free software that allows you to mess around with the source, but few of us are that daring.

However, as we have seen in previous installments of ATF, we can easily modify parts of the server with Perl, rather than with C, by installing the mod_perl module. While such a system still requires some code for each retrieved document, the overhead for running a Perl subroutine inside Apache via mod_perl is much lower than that required for an external CGI program.

Next month, we will examine a mod_perl module that goes through the links on a page and adds a “new” label for each item new to the user accessing the site. When we're done, we will have made the web a bit better and easier for pedants like me and for users who should not have to remember when they last visited a site.

Resources

All listings referred to in this article are available by anonymous download in the file ftp.linuxjournal.com/pub/lj/listings/issue63/3473.tgz.

Reuven M. Lerner (reuven@lerner.co.il) is an Internet and Web consultant living in Haifa, Israel, who has been using the Web since early 1993. His book Core Perl will be published by Prentice-Hall in the spring. The ATF home page, including archives and discussion forums, is at http://www.lerner.co.il/atf/
