Creating a Web-Based BBS, Part 3

Mr. Lerner shows us how to add a full-text search to our BBS.
Creating the Search Query

We perform what is known as a “join” between the two tables, selecting several columns from ATFMessages and one column from ATFThreads. Joins allow us to take only the most interesting columns from two or more tables, grabbing only those matching our criteria. Always remember to set up a relationship among the tables, otherwise you will get the “Cartesian product” of the results, with each of the rows in table A matched up with each of the rows in table B. We thus avoid selection like this:

SELECT, M.thread, M.subject,,
FROM ATFMessages M, ATFThreads T

which will produce the Cartesian product. Instead, we use

SELECT, M.thread, M.subject,,
FROM ATFMessages M, ATFThreads T
AND M.thread =
which qualifies the relationship between ATFMessages (given the nickname “M” in this query) and ATFThreads (with the nickname T), then lists the resulting rows in descending date order.

We also test the value of $regexp, which is set to the value of the “regexp” radio button. If $regexp is “yes”, we use the REGEXP operator in our SQL query and perform a regular expression search. Otherwise, we escape the SQL characters % and _ and use the LIKE operator. The Perl code to make this query possible looks like this:

my $sql = "SELECT, M.thread, M.subject,, T.subject ";
$sql .= "FROM ATFMessages M, ATFThreads T ";
if ($regexp eq "yes")
    $sql .= "WHERE M.text REGEXP \"$term\" ";
    $term =~ s|%|\\\%|g;
    $term =~ s|_|\\\_|g;
    $sql .= "WHERE M.text LIKE \"%$term%\" ";
$sql .= "AND M.thread = ";
$sql .= "ORDER BY desc";

Because we build the SQL query by combining text strings, we can conditionally modify parts of the query, as we saw above.

Parsing the Search Results

Results from a SQL SELECT query are always returned in a table, in which the columns are the rows requested in the query and the rows are those matching the criteria from the query. With DBI, reading the results from a query usually means iterating through the rows from within a Perl while loop.

DBI provides a number of methods for retrieving the values returned by a SELECT, but perhaps the easiest one to understand is the simple fetchrow_array method. This method is defined for $sth, the “statement handle” through which we submit our query and retrieve its results.

Several methods can be used to retrieve the results from our SELECT, but the easiest one to understand is $sth->fetchrow_array, which returns one row from the response. Each time we invoke $sth->fetchrow_array, the next row from the response table is returned. After $sth->fetchrow_array returns the last row of the response table, it returns “false”. By putting $sth->fetchrow_array inside of a while loop, we can iterate through each of the rows in the response table.

Here, then, is the code from that iterates through the results table:

while (my @row = $sth
  ($message_id, $thread_id, $subject, $author,
    $thread_name) = @row;
  print "<li><a href=\"/cgi-bin/";
  print "$thread_id#$message_id\">$subject</a>, ";
  print "by $author in ";
"<a href=\"/cgi-bin/$thread_id\">";
  print "$thread_name</a>\n";

As you can see, we assign a number of scalars to the individual elements in @row. DBI returns NULL elements (that is, elements that lack a value, rather than the C/Perl notion of “true” being non-zero) as undefined, so you can test for a value with Perl's built-in defined function. Once we have extracted the elements of @row into a number of easy-to-identify scalars, we can then use them to print results to the user's browser.

Notice how each of the hyperlinks we create does not simply point to a thread, but also to a message. We can do this by taking advantage of named anchors within a link, which allow us to force the user's browser to scroll to a particular point. If you are unfamiliar with named anchors, here is a quick lesson: in the link, “testing” is the named anchor and points to a location in test.html marked with the tag <a name="testing">. If no such tag exists, adding the named anchor to the URL has no effect.

Because our program (discussed last month) places such a named anchor at the beginning of each message header within a thread, we can thus point users directly to the message that matched their search string, rather than to the thread.

By the way, if you are interested in getting the greatest possible speed out of your application, you might want to consider using $sth->fetchrow_arrayref rather than $sth->fetchrow_array. The difference, as you might guess from their names, is that the former method returns an array reference, while the latter returns an array.

Passing a reference will always be faster than passing an array, since it involves manipulating fewer bytes. I chose to work with $sth->fetchrow_array partly because it simplified the rest of the code, and partly because I felt that we would be handling small amounts of data anyway and that the speed difference would not be too significant.

With these two files—search-form.html and—installed on our server, we now have the ability to search through the text of any message. With a few new links to the search form from our main page, this functionality is integrated into our system.