Creating and Using a Database with Perl

Perl 5 includes packages enabling your Perl scripts to create and modify databases in standard Unix formats. One of these databases can be a more efficient alternative to a flat text file (which Perl handles marvelously), and it will be compatible with other languages, like C.

Other Fun Stuff with Associative Arrays

Sometimes it is necessary to sort an associative array within a Perl script. Sorting by the key values of an associative array is done like this:

for (sort keys %phone_db) {
        print "$_ = $phone_db{$_}\n";
}

Each iteration of this loop sets the $_ scalar to the next key of the associative array, with the keys taken in ascending string (ASCII) order. This method works well for sorting associative arrays by their keys. Sorting by an associative array's values is slightly more difficult:

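sub sort_by_value {
        # Compare the values first; if they are equal, cmp returns 0
        # and || falls through to a comparison of the keys.
        ( $phone_db{$a} cmp $phone_db{$b} ) ||
        ( $a cmp $b );
}
for (sort sort_by_value keys %phone_db) {
        print "$_ = $phone_db{$_}\n";
}
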
This code replaces the default comparison that sort() uses to order its elements with a custom routine. The routine, sort_by_value, orders the keys first by their associated values and then by the keys themselves (i.e., when two values are identical, their keys are compared to determine which should appear first).
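
The tie-breaking behavior is easiest to see with a little data. The following is only a minimal sketch: the names and numbers in %phone_db are made up for illustration, and the comparison is written as an inline sort block instead of a named subroutine.

```perl
use strict;
use warnings;

# Hypothetical sample data: Carol and Bob share one number.
my %phone_db = (
    'Carol' => '555-0100',
    'Alice' => '555-0199',
    'Bob'   => '555-0100',
);

# Compare the values first; fall back to the keys when the values match.
my @by_value = sort {
    $phone_db{$a} cmp $phone_db{$b} or $a cmp $b
} keys %phone_db;

print "$_ = $phone_db{$_}\n" for @by_value;
# Prints:
#   Bob = 555-0100
#   Carol = 555-0100
#   Alice = 555-0199
```

Bob and Carol have the same value, so the key comparison places Bob first.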

Keep in mind that these two methods for sorting an associative array do not actually rearrange the array in any fashion. They simply provide a way to pull every key and value pair from an associative array in a particular sorted order.
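
Since the associative array itself stays unordered, a script that needs the same order more than once can keep the sorted key list in an ordinary array. A small sketch, again with made-up sample entries and a hypothetical @sorted_keys array:

```perl
use strict;
use warnings;

# Hypothetical sample data.
my %phone_db = (
    'Zoe'  => '555-0123',
    'Adam' => '555-0177',
    'Mike' => '555-0150',
);

# sort returns a new list of keys; %phone_db itself is not modified.
my @sorted_keys = sort keys %phone_db;

# The saved order can be reused without sorting again.
for my $name (@sorted_keys) {
    print "$name = $phone_db{$name}\n";
}

# The hash still holds the same three pairs, in no guaranteed order.
print scalar(keys %phone_db), " entries remain in the hash\n";
```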

Putting It All Together

An example of how databases in Perl can be used is provided in Listing 1, a short script designed to keep a database of hits on a World Wide Web site. The script reads the NCSA HTTPD access log file, stores the information in the database and creates an HTML page that displays all the statistics for the site.

Listing 1. Example Web Site Hits Database Script

This implementation is not complete—it keeps track only of which documents were accessed and their sizes. A more complete implementation could also store information about the hosts that accessed the web server, for instance. Some method for “expiring” entries in the database after a particular time interval would be a handy feature as well.

The script begins by reading the existing database file and placing all the data into associative arrays indexed by the document file name. Next, the script reads the access log file from standard input and places the data into the associative arrays that store the statistics. Finally, the script creates an HTML page using tables to display the statistics.
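
The log-reading and page-building steps described above might look roughly like the following. This is a sketch under stated assumptions, not the author's Listing 1: the sub names tally_line and html_report, the %hits and %size hashes, and the regular expression for the common log format are all invented for illustration. Loading the hashes from the database file and saving them back is left out, and a few made-up log lines stand in for standard input.

```perl
use strict;
use warnings;

my (%hits, %size);   # per-document statistics, keyed by request path

# Parse one common-log-format line and update the statistics.
# Returns 1 if the line was counted, 0 if it did not match.
sub tally_line {
    my ($line) = @_;
    # host ident user [date] "METHOD path protocol" status bytes
    return 0 unless $line =~
        m/^\S+ \S+ \S+ \[[^\]]*\] "\S+ (\S+)[^"]*" \d{3} (\d+|-)/;
    my ($doc, $bytes) = ($1, $2);
    $hits{$doc}++;
    # Assumption: remember the most recently logged byte count.
    $size{$doc} = $bytes eq '-' ? 0 : $bytes;
    return 1;
}

# Build an HTML table, busiest documents first.
# Note: paths are not HTML-escaped in this sketch.
sub html_report {
    my @out = ('<table>',
               '<tr><th>Document</th><th>Hits</th><th>Bytes</th></tr>');
    for my $doc (sort { $hits{$b} <=> $hits{$a} or $a cmp $b } keys %hits) {
        push @out,
            "<tr><td>$doc</td><td>$hits{$doc}</td><td>$size{$doc}</td></tr>";
    }
    push @out, '</table>';
    return join("\n", @out) . "\n";
}

# Made-up log lines; a real run would loop over <STDIN> instead.
my @sample_log = (
    'host1.example.com - - [10/Oct/1996:13:55:36 -0500] "GET /index.html HTTP/1.0" 200 2326',
    'host2.example.com - - [10/Oct/1996:13:56:02 -0500] "GET /index.html HTTP/1.0" 200 2326',
    'host1.example.com - - [10/Oct/1996:13:57:10 -0500] "GET /faq.html HTTP/1.0" 200 1045',
);
tally_line($_) for @sample_log;
print html_report();
```

Keeping the parsing in one sub and the page generation in another means either half can be replaced, for example to record host names as well, without touching the other.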

Conclusion

The topics provided in this article are by no means a definitive reference guide for using the built-in database support included with Perl, but they can be used as a starting point for further experimentation and exploration.

Randy Scott is a senior Computer Engineering student at the Milwaukee School of Engineering. He has been programming with Unix and C for nearly three years and has become an avid Perl fan in the last six months. Any questions or comments regarding this article can be sent to scottr@bork.com.

______________________

Comments

Retrieval of databases

jamesmicheallay's picture

Well, I've learned how to encode information to a database with this article, but I still don't understand database retrieval with DB_File yet. How would you write one program to encode the information and another program to retrieve it? I ask to clear up my confusion, because it seems like you have to encode the database every time you want to retrieve from it, completely defeating the purpose of saving.

thanks

Pradeep Kota's picture

Nice explanation. It helped me a lot in getting started with databases. thanks a lot. :)

cheers,
Kota.

Previous comment about anonymous hash

Anonymous's picture

It appears this was a typo, should be %phone_db(), since there is no mention of this being a scalar reference of an anonymous hash, but a hash container. I am assuming this is the case, since all other examples do not use a dereference of the hash, they would have been $$phone_db{"key"}

Incorrect syntax in hash formation

Anonymous's picture

The example under the Associative Arrays heading that shows how to store an anonymous hash's reference in a scalar is incorrect; instead of $phone_db=( ... ), it should be $phone_db={ ... } (curly braces, not parens). FWIW, this is a very common misteak 8-}
