Creating a Web-Based BBS, Part 3

Mr. Lerner shows us how to add a full-text search to our BBS.
Creating the Search Query

We perform what is known as a “join” between the two tables, selecting several columns from ATFMessages and one column from ATFThreads. Joins allow us to take only the most interesting columns from two or more tables, grabbing only those matching our criteria. Always remember to set up a relationship among the tables, otherwise you will get the “Cartesian product” of the results, with each of the rows in table A matched up with each of the rows in table B. We thus avoid selection like this:

SELECT M.id, M.thread, M.subject, M.author,
   T.subject
FROM ATFMessages M, ATFThreads T

which will produce the Cartesian product. Instead, we use

SELECT M.id, M.thread, M.subject, M.author,
   T.subject
FROM ATFMessages M, ATFThreads T
AND M.thread = T.id
ORDER BY M.date desc
which qualifies the relationship between ATFMessages (given the nickname “M” in this query) and ATFThreads (with the nickname T), then lists the resulting rows in descending date order.

We also test the value of $regexp, which is set to the value of the “regexp” radio button. If $regexp is “yes”, we use the REGEXP operator in our SQL query and perform a regular expression search. Otherwise, we escape the SQL characters % and _ and use the LIKE operator. The Perl code to make this query possible looks like this:

my $sql = "SELECT M.id, M.thread, M.subject, M.author, T.subject ";
$sql .= "FROM ATFMessages M, ATFThreads T ";
if ($regexp eq "yes")
{
    $sql .= "WHERE M.text REGEXP \"$term\" ";
}
else
{
    $term =~ s|%|\\\%|g;
    $term =~ s|_|\\\_|g;
    $sql .= "WHERE M.text LIKE \"%$term%\" ";
}
$sql .= "AND M.thread = T.id ";
$sql .= "ORDER BY M.date desc";

Because we build the SQL query by combining text strings, we can conditionally modify parts of the query, as we saw above.

Parsing the Search Results

Results from a SQL SELECT query are always returned in a table, in which the columns are the rows requested in the query and the rows are those matching the criteria from the query. With DBI, reading the results from a query usually means iterating through the rows from within a Perl while loop.

DBI provides a number of methods for retrieving the values returned by a SELECT, but perhaps the easiest one to understand is the simple fetchrow_array method. This method is defined for $sth, the “statement handle” through which we submit our query and retrieve its results.

Several methods can be used to retrieve the results from our SELECT, but the easiest one to understand is $sth->fetchrow_array, which returns one row from the response. Each time we invoke $sth->fetchrow_array, the next row from the response table is returned. After $sth->fetchrow_array returns the last row of the response table, it returns “false”. By putting $sth->fetchrow_array inside of a while loop, we can iterate through each of the rows in the response table.

Here, then, is the code from search.pl that iterates through the results table:

while (my @row = $sth
{
  ($message_id, $thread_id, $subject, $author,
    $thread_name) = @row;
  print "<li><a href=\"/cgi-bin/view-thread.pl?";
  print "$thread_id#$message_id\">$subject</a>, ";
  print "by $author in ";
  print
"<a href=\"/cgi-bin/view-thread.pl?$thread_id\">";
  print "$thread_name</a>\n";
}

As you can see, we assign a number of scalars to the individual elements in @row. DBI returns NULL elements (that is, elements that lack a value, rather than the C/Perl notion of “true” being non-zero) as undefined, so you can test for a value with Perl's built-in defined function. Once we have extracted the elements of @row into a number of easy-to-identify scalars, we can then use them to print results to the user's browser.

Notice how each of the hyperlinks we create does not simply point to a thread, but also to a message. We can do this by taking advantage of named anchors within a link, which allow us to force the user's browser to scroll to a particular point. If you are unfamiliar with named anchors, here is a quick lesson: in the link http://www.ssc.com/test.html#testing, “testing” is the named anchor and points to a location in test.html marked with the tag <a name="testing">. If no such tag exists, adding the named anchor to the URL has no effect.

Because our program view-thread.pl (discussed last month) places such a named anchor at the beginning of each message header within a thread, we can thus point users directly to the message that matched their search string, rather than to the thread.

By the way, if you are interested in getting the greatest possible speed out of your application, you might want to consider using $sth->fetchrow_arrayref rather than $sth->fetchrow_array. The difference, as you might guess from their names, is that the former method returns an array reference, while the latter returns an array.

Passing a reference will always be faster than passing an array, since it involves manipulating fewer bytes. I chose to work with $sth->fetchrow_array partly because it simplified the rest of the code, and partly because I felt that we would be handling small amounts of data anyway and that the speed difference would not be too significant.

With these two files—search-form.html and search.pl—installed on our server, we now have the ability to search through the text of any message. With a few new links to the search form from our main page, this functionality is integrated into our system.

______________________

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState