Creating and Using a Database with Perl

Perl 5 includes packages that let your scripts create and modify databases in standard Unix formats. Such a database can be a more efficient alternative to a flat text file (which Perl handles marvelously), and it remains compatible with other languages, such as C.

Opening a Database

Databases are opened in Perl using the tie() function. This function is responsible for “joining” an associative array with a database package. Operations performed on the associative array are then translated by the database package into function calls that operate on the database file itself.
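To make the translation step concrete, here is a minimal sketch of a tied hash that does nothing but record which methods Perl calls on it. LoggingHash is a hypothetical class invented for this illustration; it builds on the Tie::StdHash base class shipped with Perl and is not part of DB_File. The name and number are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# LoggingHash is a hypothetical class, used only to show what tie() does.
package LoggingHash;
require Tie::Hash;                 # provides the Tie::StdHash base class
our @ISA = ('Tie::StdHash');
our @log;                          # names of the methods Perl called

sub STORE  { my ($self, $key, $value) = @_; push @log, "STORE $key";  $self->{$key} = $value }
sub FETCH  { my ($self, $key) = @_;         push @log, "FETCH $key";  return $self->{$key} }
sub DELETE { my ($self, $key) = @_;         push @log, "DELETE $key"; return delete $self->{$key} }

package main;

tie(my %phone, 'LoggingHash') or die "Cannot tie hash";
$phone{"Bill Smith"} = "555-1234";      # Perl calls STORE
my $number = $phone{"Bill Smith"};      # Perl calls FETCH
delete $phone{"Bill Smith"};            # Perl calls DELETE
untie %phone;

print "$_\n" for @LoggingHash::log;
```

A database package such as DB_File supplies the same set of methods, except that each one reads or writes the database file instead of an in-memory hash.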

Here is an example of opening a database named “phone.db” using the DB_File database package:

use DB_File;

tie (%phone_db, 'DB_File', "phone.db") ||
        die ("Cannot open phone.db: $!");

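This command binds the associative array %phone_db to the Berkeley DB database file named “phone.db”. When only a file name is given, DB_File falls back on its default flags (O_CREAT|O_RDWR) and mode (0666), so the file is opened for read-write access and is created if it does not already exist.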

Creating a Database

Creating a database is nearly as simple as opening one. The following command will create a database named “phone.db” in the current directory with the file's permissions set to read-write for the owner and read-only for everyone else. The file will be created only if it does not already exist. If the database file exists in the current directory, the database file will simply be opened for read-write access by the Perl script.

tie (%phone_db, 'DB_File', "phone.db", O_CREAT|O_RDWR, 0644) ||
        die ("Cannot create or open phone.db: $!");

The O_CREAT and O_RDWR flags are the same flags used as parameters to the Unix open() system call. They specify that the file should be created if it does not exist and opened with read-write access.
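Put together as a complete script, creating a database might look like the sketch below. It assumes a scratch directory from File::Temp so that no real phone.db is touched; create_phone_db is a hypothetical helper name and the phone number is made up. DB_File exports the O_* flags itself, so no separate import is needed for them.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DB_File;                       # also exports O_CREAT, O_RDWR and $DB_HASH
use File::Temp qw(tempdir);

# Hypothetical helper: create (or open) the database and store one record.
sub create_phone_db {
    my ($file) = @_;
    my %phone_db;
    tie(%phone_db, 'DB_File', $file, O_CREAT|O_RDWR, 0644, $DB_HASH)
        or die "Cannot create or open $file: $!";
    $phone_db{"Bill Smith"} = "555-1234";    # made-up sample record
    untie %phone_db;
    return;
}

# Assumption: a scratch directory that is removed when the script exits.
my $file = tempdir(CLEANUP => 1) . "/phone.db";
create_phone_db($file);
print "created $file\n" if -e $file;
```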

Reading from the Database

Reading from the database works exactly like reading data from an associative array. If the key is known, specific records can be read from the file with an expression like:

$record = $phone_db{"Bill Smith"};

All the records in the database file can be scanned (in a seemingly random order) with something like:

while (($name, $record) = each %phone_db) {
        # commands to process data here
}

During each pass through the while loop, the $name scalar variable will be set to the key from the database, and the $record variable will be set to the data associated with that key.
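Two small refinements are often useful when reading: checking whether a key exists before using its value, and sorting the keys when a predictable order matters. The sketch below assumes a scratch database with made-up entries; lookup is a hypothetical helper, not part of DB_File.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DB_File;
use File::Temp qw(tempdir);

# Hypothetical helper: return the record, or a fallback if the key is absent.
sub lookup {
    my ($db, $name) = @_;
    return exists $db->{$name} ? $db->{$name} : "(no entry)";
}

# Assumption: a scratch database with made-up entries.
my $file = tempdir(CLEANUP => 1) . "/phone.db";
tie(my %phone_db, 'DB_File', $file, O_CREAT|O_RDWR, 0644)
    or die "Cannot create or open $file: $!";
$phone_db{"Bill Smith"}  = "555-1234";
$phone_db{"Alice Jones"} = "555-9876";

print lookup(\%phone_db, "Bill Smith"), "\n";
print lookup(\%phone_db, "Nobody Here"), "\n";

# keys() loads every key into memory, so sorting suits small databases only.
my @listing = map { "$_: $phone_db{$_}" } sort keys %phone_db;
print "$_\n" for @listing;

untie %phone_db;
```

For a large database, the each loop is the safer choice because it fetches one record at a time.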

Writing to the Database

New data can be written into the database by creating a new key in the associative array and setting the key's value. This is done with a command similar to:

$phone_db{"Bill Smith"} = $data;

where $data is the information to be associated with the key “Bill Smith”. Any changes made to the associative array will be written into its corresponding database file.
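Each value in the database file is stored as a single string, so a record with several fields has to be packed into one string before it is written. One simple way, sketched here, is to join the fields with a tab and split them again on reading. encode_record and decode_record are hypothetical helpers, the sample data is made up, and the sketch assumes that no field contains a tab.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DB_File;
use File::Temp qw(tempdir);

# Hypothetical helpers. Assumption: fields never contain a tab character.
sub encode_record { return join("\t", @_) }
sub decode_record { return split(/\t/, $_[0], -1) }   # -1 keeps empty trailing fields

# Assumption: a scratch database so no real file is modified.
my $file = tempdir(CLEANUP => 1) . "/phone.db";
tie(my %phone_db, 'DB_File', $file, O_CREAT|O_RDWR, 0644)
    or die "Cannot create or open $file: $!";

$phone_db{"Bill Smith"} = encode_record("555-1234", 'bill@example.com', "Accounting");

my ($phone, $email, $dept) = decode_record($phone_db{"Bill Smith"});
print "phone=$phone email=$email dept=$dept\n";

untie %phone_db;
```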

Deleting Items from the Database

Keys can be removed from the database in exactly the same way items are removed from an associative array in Perl—by using the delete() function. The following code removes the record in the database that refers to “Bill Smith”.

delete $phone_db{"Bill Smith"};

Closing the Database File

Changes to an associative array may not be immediately written out to the database file. To ensure that changes are successfully written to the database file, the file must be closed.

Closing the database file involves un-binding the associative array from the database package. This is done with the untie() function in the following manner:

untie(%phone_db);

This closes the database file, making updates to the file if necessary. The associative array %phone_db can no longer be used to access the records in the database.
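Because the data lives in the file rather than in the array, one program can write the records and another can read them later without writing them again. The sketch below puts both roles in one script for brevity. save_entries and load_entry are hypothetical helpers standing in for the two programs, and the file is a scratch file with made-up data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DB_File;
use File::Temp qw(tempdir);

# Hypothetical "writer": store the entries, then close the file.
sub save_entries {
    my ($file, %entries) = @_;
    tie(my %db, 'DB_File', $file, O_CREAT|O_RDWR, 0644)
        or die "Cannot create or open $file: $!";
    $db{$_} = $entries{$_} for keys %entries;
    untie %db;                       # flush pending changes and close
    return;
}

# Hypothetical "reader": open the existing file read-only and look up one key.
sub load_entry {
    my ($file, $name) = @_;
    tie(my %db, 'DB_File', $file, O_RDONLY, 0644)
        or die "Cannot open $file: $!";
    my $record = $db{$name};         # undef if the key is absent
    untie %db;
    return $record;
}

# Assumption: a scratch file; a real reader would be given the same path
# that the writer used.
my $file = tempdir(CLEANUP => 1) . "/phone.db";
save_entries($file, "Bill Smith" => "555-1234");
print load_entry($file, "Bill Smith"), "\n";
```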

Other Types of Databases

All of the examples provided here use the default type of Berkeley DB database, the DB_HASH type. This form of database uses a hash table (like Perl does) to store the keys and their values in the database file. Two other types of databases are provided with the Berkeley DB package: DB_BTREE and DB_RECNO.

The DB_BTREE format uses a sorted, balanced tree (a B-tree) to store the key and value pairs. This format allows data to be stored and read in sorted order, as opposed to the seemingly random order the DB_HASH format produces. The default comparison routine sorts the keys in the database file in lexical order (alphabetically). The DB_File man page discusses this format in more detail and shows how to replace the default comparison routine with one of your own.
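As a sketch of a custom comparison routine, the following stores three made-up names in a DB_BTREE file that orders its keys without regard to case. names_in_btree_order is a hypothetical helper. It uses a private DB_File::BTREEINFO object instead of changing the shared $DB_BTREE, which is a choice made for this example rather than a requirement.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DB_File;
use File::Temp qw(tempdir);

# Hypothetical helper: store the entries in a BTREE file with a
# case-insensitive key order and return the keys as the database lists them.
sub names_in_btree_order {
    my ($file, %entries) = @_;

    my $btree = DB_File::BTREEINFO->new();
    # The routine must return a negative, zero or positive number, like cmp.
    $btree->{'compare'} = sub {
        my ($key1, $key2) = @_;
        return lc($key1) cmp lc($key2);
    };

    tie(my %db, 'DB_File', $file, O_CREAT|O_RDWR, 0644, $btree)
        or die "Cannot create or open $file: $!";
    $db{$_} = $entries{$_} for keys %entries;

    my @names = keys %db;            # comparison order, not insertion order
    untie %db;
    return @names;
}

# Assumption: a scratch file and made-up entries.
my $file = tempdir(CLEANUP => 1) . "/phone.db";
my @names = names_in_btree_order(
    $file,
    "Carol White" => "555-0003",
    "alice Jones" => "555-0001",
    "Bill Smith"  => "555-0002",
);
print "$_\n" for @names;
```

A file created with a custom comparison routine should be opened with the same routine every time, since the stored order depends on it.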

The DB_RECNO format is designed to operate on flat text files. It is bound (with tie()) to normal Perl arrays, not associative arrays. Indexing this array with a number provides the text found on that line of the database file. This format is also discussed in more detail in the DB_File man page.
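A minimal sketch of the DB_RECNO format follows. It writes a small text file with made-up lines and then reads it back through a tied array. line_count_and_nth is a hypothetical helper, and the file is a scratch file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DB_File;
use File::Temp qw(tempdir);

# Hypothetical helper: return the number of lines and the line at index $n.
sub line_count_and_nth {
    my ($file, $n) = @_;
    # Bind a normal array (not a hash) to the file; each element is one line.
    tie(my @lines, 'DB_File', $file, O_RDWR, 0644, $DB_RECNO)
        or die "Cannot open $file: $!";
    my $count = scalar @lines;
    my $line  = $lines[$n];          # index 0 is the first line
    untie @lines;
    return ($count, $line);
}

# Assumption: a scratch text file with made-up contents.
my $file = tempdir(CLEANUP => 1) . "/phone.txt";
open(my $fh, '>', $file) or die "Cannot write $file: $!";
print $fh "Bill Smith 555-1234\n", "Carol White 555-0003\n";
close($fh) or die "Cannot close $file: $!";

my ($count, $second) = line_count_and_nth($file, 1);
print "$count lines; second line: $second\n";
```

Assigning to an element of the tied array would change the corresponding line of the text file, so the sketch only reads.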

The desired database format is specified with an additional parameter to the tie() function:

tie (%phone_db, 'DB_File', "phone.db", O_RDONLY, 0644, $DB_BTREE) ||
        die ("Cannot open phone.db: $!");

This command will open the DB_BTREE database named “phone.db” in read-only access mode. If the file does not exist, the command fails.
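Since a read-only open fails when the file is missing, a script may prefer to handle that case rather than die. The sketch below is one way to do it. open_readonly is a hypothetical helper that returns a reference to the tied hash, or nothing on failure, and the path is a deliberately nonexistent scratch file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DB_File;
use File::Temp qw(tempdir);

# Hypothetical helper: open an existing BTREE database read-only.
# Returns a reference to the tied hash, or nothing if tie() fails.
sub open_readonly {
    my ($file) = @_;
    my %db;
    tie(%db, 'DB_File', $file, O_RDONLY, 0644, $DB_BTREE) or return;
    return \%db;
}

# Assumption: a path in a scratch directory where no database exists yet.
my $dir     = tempdir(CLEANUP => 1);
my $missing = "$dir/no-such.db";

if (my $db = open_readonly($missing)) {
    print "$_\n" for keys %$db;
    untie %$db;
}
else {
    print "Cannot open $missing: $!\n";   # $! holds the reason for the failure
}
```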
