Updating Pages Automatically

Have a need to change a file on your web site on a daily or monthly basis? This month Mr. Lerner tells us how to do it.

Using Symbolic Links

The above cron-based technique works, but has some annoying side effects. For example, what happens if you decide to change the Tuesday menu on Tuesday morning? The change will not be reflected until the following Tuesday, because today.html contains the contents of file-2.html from 12:01 a.m., when the snapshot was taken.

To solve this problem, as well as reduce the disk space used by keeping two copies of the same file, we can use symbolic links. These look like files, but are really pointers to files, similar to Macintosh “aliases” or Windows “shortcuts”. If we create a symbolic link from today.html to file-0.html, the two file names will be equivalent for most purposes. (“Hard” links are also available under Linux, but are more limited.)

If we want to create a symbolic link named today.html that points to file-0.html, we say

ln -s file-0.html today.html

If you want to change the link so that it points to file-1.html, remove the old link and create a new one, like this:

rm -fv today.html
ln -s file-1.html today.html

Alternatively, we can use the -f (“force”) option to ln, which replaces today.html even if it already exists as a link to another file:

ln -sf file-0.html today.html

If we were to do this each day, removing the old link and creating a new one, we would be doing effectively the same thing as in cron-today.pl, but with the added advantage that today.html always reflects the current contents of the original file. In addition, we would be saving space on the file system by pointing to the original file rather than copying it.

Listing 3 contains a short Perl program, meant to be run via cron, which creates such a link. Anything the program sends to standard output (STDOUT) via “print” statements is mailed by cron to the owner of the cron job. This program assumes the owner of the cron job (under whose user ID the program is run) has permission to remove the existing file, as well as create a new symbolic link in the directory. It is possible to create a symbolic link to any file name, including that of a nonexistent file; permissions are checked only when you try to access the file through the link.
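The general shape of such a cron job can be sketched as follows. This is only an illustration, not a reproduction of Listing 3: the helper name update_link is hypothetical, and the sketch assumes the files are named file-0.html (Sunday) through file-6.html (Saturday) and that the program runs in the directory containing them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: point $link at $target, replacing any existing link.
# Returns 1 on success, 0 on failure.
sub update_link {
    my ($target, $link) = @_;

    # -l is true for a symbolic link even when its target is missing
    if (-l $link || -e $link) {
        unless (unlink $link) {
            print "Cannot remove $link: $!\n";    # cron mails this output
            return 0;
        }
    }

    unless (symlink $target, $link) {
        print "Cannot link $link to $target: $!\n";
        return 0;
    }
    return 1;
}

# Assumed naming scheme: file-0.html (Sunday) through file-6.html (Saturday)
my $weekday = (localtime)[6];
update_link("file-$weekday.html", "today.html");
```

Because the sketch prints only when something goes wrong, the owner of the cron job receives mail only on failure.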

Publishing Daily Items

The techniques we have examined so far are most useful when the same item appears each week or perhaps each month. In many cases, though, publishing on the Web involves creating a new file each day and making that available. For starters, we will look into how to create a new file each day (of the form file-1.html, as before), so that the newest file will be available by looking at today.html.

Once again, we could accomplish this with either a CGI program or a cron job, examples of which you can see in Listing 4 and Listing 5, respectively. Both programs use the same basic algorithm to find the highest-numbered file of the form file-n.html, where n is the sequential number for the file.

The key to both programs is in these lines:

# Collect the file-n.html names, sorted by sequence number
if (opendir(DIR, $directory)) {
    @files = sort by_number
        grep {/^file-[0-9]+\.html$/} readdir(DIR);
    closedir DIR;
}

First, we open $directory, the directory in which the files exist. (If the program cannot open the directory, it logs an error.) We then read the contents of the directory through the directory handle DIR, using Perl's grep function to filter out any files not fitting the file-n.html pattern. Finally, we sort those files with our own by_number routine, which compares the sequential numbers rather than the full file names.
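The by_number routine itself is not shown in the excerpt above. One plausible way to write it, offered here as a sketch rather than as the version in the listings, is to pull the number out of each name and compare numerically. The sample names in @names are made up for illustration.

```perl
use strict;
use warnings;

# Possible comparison routine: order names of the form file-n.html by n.
# $a and $b are set by sort for each pair of names being compared.
sub by_number {
    my ($na) = $a =~ /^file-([0-9]+)\.html$/;
    my ($nb) = $b =~ /^file-([0-9]+)\.html$/;
    return $na <=> $nb;
}

# Made-up directory contents, standing in for readdir(DIR)
my @names = ('file-10.html', 'index.html', 'file-2.html', 'file-1.html');
my @files = sort by_number grep { /^file-[0-9]+\.html$/ } @names;

print "@files\n";    # file-1.html file-2.html file-10.html
```

A plain string sort would place file-10.html before file-2.html, which is why the numeric comparison matters once there are ten or more files.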

Once we have the list of files, we pick off the last element of @files, which has the highest sequential number. We can then redirect the user's browser to that file using CGI.pm's redirect method.
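As a rough illustration of this last step, the sketch below takes an already sorted list and builds a redirect response for the newest file. To keep the example free of module dependencies it assembles the header by hand; in a real CGI program, CGI.pm's redirect method would produce a header along these lines for you. The base URL and the file names are hypothetical.

```perl
use strict;
use warnings;

# Assumed to be already sorted by sequence number, lowest first
my @files = ('file-1.html', 'file-2.html', 'file-10.html');

# Hypothetical URL of the directory holding the files
my $base_url = 'http://www.example.com/daily';

# The last element of the sorted list has the highest sequence number
my $newest = $files[-1];

# Hand-built redirect; with CGI.pm this would be something like
#   print $query->redirect("$base_url/$newest");
my $response = "Status: 302 Found\r\nLocation: $base_url/$newest\r\n\r\n";
print $response;
```

A production version would also need to handle the case in which @files is empty, for example by returning an error page instead of a redirect.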

If we want to publish items each day, we should try a better system than this one, which depends on sequential numbers. First of all, it is easier to handle file names which mention the subject (e.g., menu.html) or the date (e.g., file-1998-06-01), rather than something named with sequential numbers, as in file-3023.html.

Secondly, arranging articles by date provides users with a natural way of navigating through archives in the future without having to depend on the site's navigation scheme. In addition, creating file names according to date rather than sequential numbers decreases the chances of error.

If you choose to use the date in the file name, as in file-1998-06-01, try to keep the date elements in year-month-day order, so that sorting file names alphanumerically will also sort them chronologically. Then, we can write a small program to select the file for today based on the date and run it each day with cron. An example is shown in Listing 6. The program logic is fairly straightforward, taking the date information from our call to localtime and piecing those elements together to create the file name.
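The date-handling logic might look something like the following sketch, which is not Listing 6 itself. The helper name file_for_time is hypothetical, and the sketch assumes the dated files carry an .html suffix and live in the directory from which the program is run.

```perl
use strict;
use warnings;

# Hypothetical helper: build a name such as file-1998-06-01.html
# from a time expressed in seconds since the epoch.
sub file_for_time {
    my ($time) = @_;
    my ($mday, $mon, $year) = (localtime($time))[3, 4, 5];

    # localtime counts months from 0 and years from 1900
    return sprintf "file-%04d-%02d-%02d.html", $year + 1900, $mon + 1, $mday;
}

my $target = file_for_time(time);
my $link   = "today.html";

# Replace the old link, reporting problems on STDOUT so that cron mails them
unlink $link if -l $link;
symlink $target, $link
    or print "Cannot link $link to $target: $!\n";
```

Zero-padding the month and day with %02d is what keeps alphanumeric order and chronological order the same.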

However, problems may arise if the file for today does not exist. As I mentioned earlier, symbolic links do not have to point to files; they may point to any valid file name, even if no file by that name exists. If the symbolic link points to a non-existent file, users will be greeted with a dreaded “404--File not found” error upon loading today.html from our site. A more sophisticated version of this program would check whether a file corresponding to today's date exists on the site. If it does not, the program would search backward (or forward, if you prefer) chronologically to find the best match for the today.html symbolic link. It could even send e-mail to the webmaster indicating that such a problem exists.
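A backward search of this kind could be sketched as follows. The helper name latest_existing, the 30-day limit and the .html suffix are all assumptions made for illustration. The search steps back from noon rather than from the current time, so that a daylight-saving change cannot cause a day to be skipped when the job runs just after midnight.

```perl
use strict;
use warnings;
use POSIX qw(mktime);

# Hypothetical helper: return the name of the newest existing file dated on
# or before $time, looking back at most $max_days days; undef if none exists.
sub latest_existing {
    my ($dir, $time, $max_days) = @_;

    # Anchor at noon so that a 23- or 25-hour day cannot skip a date
    my ($mday, $mon, $year) = (localtime($time))[3, 4, 5];
    my $noon = mktime(0, 0, 12, $mday, $mon, $year);

    for my $back (0 .. $max_days) {
        my ($d, $m, $y) = (localtime($noon - $back * 86_400))[3, 4, 5];
        my $name = sprintf "file-%04d-%02d-%02d.html", $y + 1900, $m + 1, $d;
        return $name if -e "$dir/$name";
    }
    return;    # nothing suitable found
}

my $target = latest_existing(".", time, 30);
if (defined $target) {
    unlink "today.html" if -l "today.html";
    symlink $target, "today.html"
        or print "Cannot link today.html to $target: $!\n";
} else {
    # cron mails STDOUT to the job's owner, which alerts the webmaster
    print "No dated file found within the last 30 days\n";
}
```

Because today.html is left untouched when nothing is found, visitors continue to see the previous item rather than an error page.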