Personalizing “New” Labels
Like many people, I spend a great deal of time on the Web. Some of that time is spent working—writing and debugging programs for various clients. I also spend quite a bit of time reading on the Web, keeping track of the latest news from the real world and the computer industry, and even exploring new sites that friends and colleagues have suggested I visit.
A common feature on web sites, one which never fails to annoy me, is the proliferation of graphics indicating which items are new. I don't mind the fact that the site's author is letting me know the most recently changed or added items. Rather, it bothers me to know that these tags indicate whether the document is new, rather than whether the document is new for me.
When I visit a site for the first time, all of the documents should have a “new” indication, since all are new to me. When I return to the site, only those added since my previous visit should have the “new” graphic, perhaps including those modified since my last visit. In other words, the site should keep track of my usage patterns, rather than force me to remember whether I have read a particular file.
This month, we will take a look at this problem. Not only will we see how to create a web site that fails to annoy me in this particular way, but we will also look at some of the trade-offs that often occur when trying to handle site maintenance, service to the end user and program maintainability.
Now that I have disparaged the practice of putting “new” labels on a web site's links, let me demonstrate it, so we can have a clear starting point. Here is a simple page of HTML with two links on it, one with a “new” graphic next to it:
<HTML>
<Head><Title>Welcome to My Site</Title></Head>
<Body>
<H1>Welcome to My Site</H1>
<P>Read <a href="resume.html">my resume.</a></P>
<P>Read <a href="deathvalley.html"><img src="new.gif"> about my recent trip to Death Valley!</a></P>
</Body>
</HTML>
When the page's author decides enough time has passed, the “new” logo comes down. These labels are updated by modifying the HTML file, inserting or erasing the graphics as necessary.
This technique has a number of advantages, the main one being that the site requires less horsepower to run. Serving static text and graphics demands less of the server's processor than running a CGI program, which requires additional memory as well as processing time.
However, this technique also has many disadvantages. First of all, the labels change only when the webmaster decides to modify the HTML file, rather than on an automatic basis. Secondly, the labels fail to take users' individual histories into account, meaning first-time users will see the same “new” labels as daily visitors.
How can we approach this problem? Let's begin with a simple solution that does not use personalization, but does provide more accurate labels than the above approach. We can auto-expire the labels, printing “new” during the first week a file is made available and “modified” the second week. Files more than two weeks old will not have a label.
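The auto-expiry rule can be sketched as a small Perl function. This is only an illustration, not the code from Listing 1: the name label_for_age is hypothetical, and the cutoffs of 7 and 14 days are one reading of "the first week" and "the second week".

```perl
use strict;
use warnings;

# Hypothetical helper (not from the article's listing): map a file's age
# in days to the label that should appear next to its link.
sub label_for_age {
    my ($age_in_days) = @_;

    return 'new'      if $age_in_days <= 7;     # first week
    return 'modified' if $age_in_days <= 14;    # second week
    return '';                                  # older files get no label
}

# Show the label chosen for a few sample ages.
for my $days (3, 10, 20) {
    printf "%2d days: %s\n", $days, label_for_age($days) || '(no label)';
}
```

Keeping the rule in one function means the cutoffs can be changed in a single place if a week turns out to be too long or too short for a given site.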
The easiest way to do this is via server-side includes. SSIs execute as if they were CGI programs, but their output is inserted inside an HTML file. SSIs are useful when you want dynamic or otherwise programmable text inside an HTML file, but don't have enough dynamic output to justify burying the HTML inside a CGI program.
In this particular case, we can take advantage of Apache's advanced server-side include functionality, which allows us to execute a CGI program and insert its output into an HTML file. For example, we can slightly modify our file like this:
<HTML>
<Head><Title>Welcome to My Site</Title></Head>
<Body>
<H1>Welcome to My Site</H1>
<P>Read <a href="resume.html">my resume.</a></P>
<P>Read <a href="deathvalley.html">
<!--#include virtual="/cgi-bin/print-label.pl?deathvalley.html" -->
about my recent trip to Death Valley!</a></P>
</Body>
</HTML>
As you can see, the second link includes an SSI. One nice thing about SSIs is that they look like HTML comments, so if you accidentally install an SSI-enabled file on a server that does not know how to parse them, browsers will treat the entire SSI as a comment and ignore it.
SSIs work thanks to a bit of magic: before the document is returned to the user's browser, it is interpreted by the server (hence the term “server-side includes”). Apache replaces all of the SSI commands with the result of their execution. This could mean printing something as simple as a file's modification date, but might be as complicated as inserting the results of a large database-access client invoked via CGI.
In the above example, we run the CGI program print-label.pl, the code for which is in Listing 1. While this program is run via SSI rather than a pure CGI call, it works just like a CGI program. We use CGI.pm, the standard Perl module for writing CGI programs, to retrieve the keywords parameter, which is another way of describing a parameter passed via the GET method following the question mark.
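To make the idea of a bare parameter after the question mark concrete, here is a minimal sketch that reads it straight from the QUERY_STRING environment variable, which is where a CGI program finds that text. The article's program relies on CGI.pm for this job; the helper name filename_from_query and the rule of rejecting anything containing a slash or ".." are assumptions added for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the text after the "?" into a plain file name.
# Returns undef when nothing usable was supplied.
sub filename_from_query {
    my ($query_string) = @_;
    return undef unless defined $query_string && length $query_string;

    # Undo URL encoding: "+" means a space, %XX is a hex-encoded byte.
    (my $name = $query_string) =~ tr/+/ /;
    $name =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;

    # Assumption: only simple names are allowed, so refuse directory parts.
    return undef if $name =~ m{/} || $name =~ /\.\./;

    return $name;
}

# For the request print-label.pl?deathvalley.html, the server sets
# QUERY_STRING to "deathvalley.html".
my $filename = filename_from_query($ENV{QUERY_STRING});
print defined($filename) ? "file: $filename\n" : "no file name given\n";
```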
Once we have checked to make sure the file exists, we use the -M operator to ask Perl for the number of days that have passed since the file was last modified. If $ctime is equal to or less than 7, the file was modified within the last seven days, meaning the file should be considered “new” for our purposes. We use a font tag to tell the user that the file is new.
If we use SSI with each link on our site, the “New!” message will appear for all links less than one week old.
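The existence check and the -M test described above can be sketched as follows. This is not the code from Listing 1: the function name new_label_for_file and the red font color are assumptions, and the CGI plumbing is left out so that the logic stands on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: return the HTML for a "New!" label, or an empty
# string when the file is missing, unnamed, or older than a week.
sub new_label_for_file {
    my ($path) = @_;

    # Stay silent when no name is given or the file does not exist.
    return '' unless defined $path && length $path && -e $path;

    # -M yields the file's age in days (possibly fractional), counted back
    # from the moment the script started to the last modification.
    my $age_in_days = -M $path;

    # The font color is an assumption; the article says only that a font
    # tag is used.
    return $age_in_days <= 7 ? '<font color="red">New!</font>' : '';
}

# Print the label for each file named on the command line.
for my $file (@ARGV) {
    print "$file: ", new_label_for_file($file), "\n";
}
```

Returning an empty string for every failure case matches the decision to exit silently, since the SSI then inserts nothing into the page.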
I considered several ways of handling errors within print-label.pl, including using Perl's die function to exit prematurely and print an error message on the user's screen. In the end, I decided the program should exit silently if the file does not exist, or if no file name is specified at all. You may wish to send a message to the error log, which can be accomplished from within a CGI program by printing to STDERR as follows:
print STDERR "No such file \"$filename\"\n";
A major problem with this arrangement is that CGI programs are inherently resource hogs. If we have ten links on a page, using this technique involves running ten CGI programs—which means launching ten new Perl processes each time we view this page. For now, we will ignore the performance implications and focus on how to get things working. I will discuss performance toward the end of this article and in greater depth next month.