Speeding up Database Access with mod_perl

Continuing the discussion of mod_perl, Mr. Lerner tells us about the DBI specification and the Apache::DBI module.
Retrieving Rows with SELECT

If we want to retrieve matching rows from the database, we need to modify the syntax just a bit. After all, we expect to not only receive a report on whether the database was able to perform our requested action, but also see the results.

Assuming we are connected to the database, we set $sql to our SQL query:

$sql = "SELECT id,contents FROM test_insert";

We then use the prepare method to send our query, as follows:

$sth = $dbh->prepare($sql);
The $dbh->prepare result is known as a “statement handle”, which is traditionally named $sth. Just as $dbh allows us to perform operations on the database to which we have connected, $sth allows us to perform operations on the statement we have just sent. And just as $dbh is undefined in the case of an error, so too is $sth:
&log_and_die($sth->errstr) unless $sth;
Assuming that $sth was sent to the database successfully, we tell the database to execute our query, checking for problems with the return code:
$sth->execute || &log_and_die($sth->err);
Now comes the fun part, namely iterating through each of the rows returned to us. We can find out how many rows were returned as a result of our query by checking the value of $sth->rows. We can then retrieve each of the returned rows, one by one (with one column value per row), using $sth->fetchrow. When there are no more rows to retrieve, $sth->fetchrow returns false, which means that we can use it within a “while” loop. Indeed, this is a fairly standard idiom in the DBI world:
# Loop through returned rows
while (@row = $sth->fetchrow)
{
# Grab the columns from the row
$id = $row[0];
$contents = $row[1];
# Print the ID and the contents
print "<P>$id:\"$contents\"</P>\n";
}
When we are finished with this statement, we use the finish method associated with the statement, which is analogous to the disconnect method for the database handle:
$sth->finish;
Now that we have reviewed all of this in theory, let's put it into practice. First, we will create a small table in the “test” database in MySQL, by running the mysql client program:
mysql test
Once we see the mysql> prompt, we can create our small test table:
CREATE TABLE test_insert
 (id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY
        KEY, contents VARCHAR(50) NOT NULL,
        UNIQUE (contents));
The above defines our table, test_insert, to have two columns. The first column, id, is defined to contain an unsigned integer. The integer, whose presence is mandatory (NOT NULL), is automatically incremented every time we insert a row into the table and can be used as a unique index into the table. The second column, contents, is a variable-length character string whose presence is mandatory (NOT NULL) and which cannot be repeated in another record (UNIQUE).

The CGI program in Listing 1 demonstrates all of the above, first inserting a number of rows into the table and then retrieving them. Most DBI programs are as simple as this one, although many either store or retrieve information, rather than do both.

One of the problems with a standard like DBI is that the interface follows the least common denominator. That is, there are differences between database packages in addition to their administration and speed; just about every package includes a number of non-standard SQL commands and features in order to differentiate itself from the competition. If you are interested in using such features, you might have to use the func DBI method, which enables proprietary database extensions. Of course, doing so means that your program is no longer portable to other databases, which might be a concern if you switch to another vendor.

Moving the Program to Apache::DBI

We have now compiled Apache to use mod_perl, configured a perl-bin directory for serving mod_perl programs and configured Apache to insert the Apache::DBI module for all programs within the perl-bin directory. We are all set to take our sample DBI program and use it with mod_perl.

How must we modify the program in order to get it to work with mod_perl? Actually, we needn't make any modifications at all, if we have configured our copy of Apache as described above. All we need to do is copy our program into perl-bin, set the appropriate permissions and give it a whirl. Here's what I wrote on my computer, for example:

~httpd/cgi-bin% cp dbi-demo.pl ../perl-bin/
~httpd/cgi-bin% chmod ug+x ../perl-bin/dbi-demo.pl

I changed the URL in an open browser window such that it pointed to perl-bin rather than cgi-bin, and—voilà--it all just worked.

When I first started to use mod_perl and Apache::DBI, I wasn't sure how much faster programs would run. The execution certainly seemed faster, but I wasn't sure how much of an improvement I was seeing. I decided to use Perl's Benchmark module, comparing the execution speed of two different programs. I would try to insert 100 random text strings into a database, first using a CGI program and then using an Apache::DBI version of the same program (which, as we now know, simply means a version of the program placed in the perl-bin directory).

Benchmarking is a tricky and subtle business, and there are undoubtedly factors which I neglected when calculating these results. Even so, they seem to bear witness to the amazing performance difference between CGI and mod_perl. I'm sure if I were to spend a great deal more time working on my Apache and/or MySQL configuration, I could get even better performance out of my lowly 75 MHz Pentium running Red Hat 4.2. However, the relative numbers should speak for themselves.

First, let's examine the test I performed. I used the same test_insert table in MySQL as we saw before. I then wrote a CGI program, similar to the one we saw before, which connects to the database and inserts a random value into the contents column. The resulting program is shown in Listing 2.

______________________

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState