Creating and Using a Database with Perl
Databases are opened in Perl using the tie() function. This function is responsible for “joining” an associative array with a database package. Operations performed on the associative array are then translated by the database package into function calls that operate on the database file itself.
Here is an example of opening a database named “phone.db” using the DB_File database package:
use DB_File;
tie(%phone_db, 'DB_File', "phone.db") or
    die "Cannot open phone.db: $!";
This command binds the associative array %phone_db to the Berkeley DB database file named “phone.db”. In this example, the file must exist and must be readable by the Perl script.
Creating a database is nearly as simple as opening one. The following command will create a database named “phone.db” in the current directory with the file's permissions set to read-write for the owner and read-only for everyone else. The file will be created only if it does not already exist. If the database file exists in the current directory, the database file will simply be opened for read-write access by the Perl script.
use Fcntl;   # supplies the O_CREAT and O_RDWR constants
tie(%phone_db, 'DB_File', "phone.db", O_CREAT|O_RDWR, 0644) or
    die "Cannot create or open phone.db: $!";
The O_CREAT and O_RDWR flags are the same flags used as parameters to the Unix open() system call. They specify that the file should be created if it does not exist and opened with read-write access.
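Put together as a complete script, the create-or-open call might look like the following minimal sketch. It assumes the DB_File module is installed. The scratch directory from File::Temp and the sample phone number are illustrative choices, not part of the original example.

```perl
use strict;
use warnings;
use DB_File;                  # provides the tie() class plus $DB_HASH
use Fcntl;                    # O_CREAT, O_RDWR
use File::Temp qw(tempdir);   # assumption: scratch directory for the demo

# Keep the demo database out of the current directory.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/phone.db";

# Create the file if it is missing, otherwise open it read-write.
my %phone_db;
tie(%phone_db, 'DB_File', $file, O_CREAT|O_RDWR, 0644, $DB_HASH)
    or die "Cannot create or open $file: $!";

$phone_db{"Bill Smith"} = "555-1234";   # sample data
my $record = $phone_db{"Bill Smith"};
print "Bill Smith: $record\n";

# Close the database so the data is written to disk.
untie %phone_db;
```

Quoting the class name ('DB_File') keeps the call valid under "use strict", and including $! in the error message reports why the open failed.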
Reading from the database works exactly like reading data from an associative array. If the key is known, specific records can be read from the file with an expression like:
$record = $phone_db{"Bill Smith"};
All the records in the database file can be scanned (in a seemingly random order) with something like:
while (($name, $record) = each %phone_db) {
    # commands to process data here
}
During each pass through the while loop, the $name scalar variable will be set to the key value from the database, and the $record variable will be set to the data associated with the key.
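A runnable version of this scan could look like the sketch below. The three sample entries and the %seen hash used to collect them are hypothetical additions for illustration. The order in which the records come back is not predictable with the DB_HASH format.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;
use File::Temp qw(tempdir);   # assumption: scratch directory for the demo

my $dir = tempdir(CLEANUP => 1);
my %phone_db;
tie(%phone_db, 'DB_File', "$dir/phone.db", O_CREAT|O_RDWR, 0644, $DB_HASH)
    or die "Cannot create or open phone.db: $!";

# Hypothetical sample records.
$phone_db{"Bill Smith"} = "555-1234";
$phone_db{"Ann Jones"}  = "555-9876";
$phone_db{"Raj Patel"}  = "555-4321";

# Visit every key/value pair once; copy each into an ordinary hash.
my %seen;
while (my ($name, $record) = each %phone_db) {
    print "$name => $record\n";
    $seen{$name} = $record;
}

untie %phone_db;
```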
New data can be written into the database by creating a new key in the associative array and setting the key's value. This is done with a command similar to:
$phone_db{"Bill Smith"} = $data;
where $data is the information to be associated with the key “Bill Smith”. Any changes made to the associative array will be written into its corresponding database file.
Keys can be removed from the database in exactly the same way items are removed from an associative array in Perl—by using the delete() function. The following code removes the record in the database that refers to “Bill Smith”.
delete $phone_db{"Bill Smith"};
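To confirm that a record is really gone, the key can be tested with exists() before and after the delete. This is a small sketch with a sample record. The $before and $after variables are only there to show the result.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;
use File::Temp qw(tempdir);   # assumption: scratch directory for the demo

my $dir = tempdir(CLEANUP => 1);
my %phone_db;
tie(%phone_db, 'DB_File', "$dir/phone.db", O_CREAT|O_RDWR, 0644, $DB_HASH)
    or die "Cannot create or open phone.db: $!";

$phone_db{"Bill Smith"} = "555-1234";   # sample record

# exists() asks the database whether the key is present.
my $before = exists $phone_db{"Bill Smith"} ? 1 : 0;
delete $phone_db{"Bill Smith"};
my $after  = exists $phone_db{"Bill Smith"} ? 1 : 0;

print "present before delete: $before, after delete: $after\n";

untie %phone_db;
```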
Changes to an associative array may not be immediately written out to the database file. To ensure that changes are successfully written to the database file, the file must be closed.
Closing the database file involves un-binding the associative array from the database package. This is done with the untie() function in the following manner:
untie(%phone_db);
This closes the database file, making updates to the file if necessary. The associative array %phone_db can now no longer be used to access the records in the database.
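Because untie() flushes the data to disk, a later tie() on the same file sees what was stored earlier, even from a different program. The sketch below imitates a separate "writer" and "reader" in one script. The %out and %in names and the sample record are illustrative assumptions.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;
use File::Temp qw(tempdir);   # assumption: scratch directory for the demo

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/phone.db";

# Writer phase: create the database, store a record, close it.
my %out;
tie(%out, 'DB_File', $file, O_CREAT|O_RDWR, 0644, $DB_HASH)
    or die "Cannot create $file: $!";
$out{"Bill Smith"} = "555-1234";
untie %out;

# Reader phase: reopen the same file read-only and fetch the record.
# Nothing is stored again; the data comes from the file on disk.
my %in;
tie(%in, 'DB_File', $file, O_RDONLY, 0644, $DB_HASH)
    or die "Cannot open $file: $!";
my $record = $in{"Bill Smith"};
untie %in;

print "Read back: $record\n";
```

In practice the two phases would live in separate scripts that agree only on the file name and database type.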
All of the examples provided here use the default type of Berkeley DB database, the DB_HASH type. This form of database uses a hash table (like Perl does) to store the keys and their values in the database file. Two other types of databases are provided with the Berkeley DB package: DB_BTREE and DB_RECNO.
The DB_BTREE format uses a sorted, balanced binary tree to store the key and value pairs. This format allows data to be stored and read in a sorted order as opposed to the seemingly random order the DB_HASH format produces. The default comparison routine sorts the keys in the database file in lexical order (alphabetically). The DB_File man page discusses this format in more detail and shows how to replace the default comparison routine with one of your own.
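The effect of the comparison routine can be seen by storing the same keys in two DB_BTREE files. The sketch below follows the pattern shown in the DB_File documentation, using a DB_File::BTREEINFO object with a "compare" entry. The key names and the case-insensitive comparison are illustrative choices.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;
use File::Temp qw(tempdir);   # assumption: scratch directory for the demo

my $dir   = tempdir(CLEANUP => 1);
my @names = ("Carol", "alice", "Bob");   # sample keys

# Default comparison: plain lexical ordering, so uppercase sorts first.
my %default_order;
tie(%default_order, 'DB_File', "$dir/default.db", O_CREAT|O_RDWR, 0644, $DB_BTREE)
    or die "Cannot create default.db: $!";
$default_order{$_} = 1 for @names;
my @default_keys = keys %default_order;
untie %default_order;

# Custom comparison: ignore case when ordering the keys.
my $btree = DB_File::BTREEINFO->new;
$btree->{compare} = sub {
    my ($key1, $key2) = @_;
    return lc($key1) cmp lc($key2);
};

my %nocase;
tie(%nocase, 'DB_File', "$dir/nocase.db", O_CREAT|O_RDWR, 0644, $btree)
    or die "Cannot create nocase.db: $!";
$nocase{$_} = 1 for @names;
my @nocase_keys = keys %nocase;
untie %nocase;

print "default order:          @default_keys\n";
print "case-insensitive order: @nocase_keys\n";
```

A database created with a custom comparison routine should always be reopened with the same routine, since the stored order depends on it.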
The DB_RECNO format is designed to operate on flat text files. It is bound (with tie()) to normal Perl arrays, not associative arrays. Indexing this array with a number provides the text found on that line of the database file. This format is also discussed in more detail in the DB_File man page.
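As a rough illustration, the sketch below writes a three-line text file with ordinary Perl I/O and then reads one line back through a DB_RECNO array. The file name notes.txt and its contents are made up for the example.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;
use File::Temp qw(tempdir);   # assumption: scratch directory for the demo

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/notes.txt";

# Create a small flat text file the ordinary way.
open(my $fh, '>', $file) or die "Cannot write $file: $!";
print {$fh} "alpha\nbeta\ngamma\n";
close($fh) or die "Cannot close $file: $!";

# Bind a normal array to the text file.
my @lines;
tie(@lines, 'DB_File', $file, O_RDWR, 0644, $DB_RECNO)
    or die "Cannot open $file: $!";

# Index 1 is the second line of the file (indexes start at 0).
my $second = $lines[1];
print "line 2: $second\n";

untie @lines;
```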
The desired database format is specified with an additional (sixth) parameter to the tie() function: $DB_HASH, $DB_BTREE or $DB_RECNO.
tie(%phone_db, 'DB_File', "phone.db", O_RDONLY, 0644, $DB_BTREE) or
    die "Cannot open phone.db: $!";
This command will open the DB_BTREE database named “phone.db” in read-only access mode. If the file does not exist, the command fails.
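The failure case can be checked directly, since tie() returns a false value when the open fails. In this sketch the file name missing.db is hypothetical and is deliberately never created. The $opened flag exists only to record the outcome.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;
use File::Temp qw(tempdir);   # assumption: empty scratch directory

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/missing.db";   # hypothetical file that does not exist

# O_RDONLY without O_CREAT: the file is opened only if already present.
my %phone_db;
my $opened = tie(%phone_db, 'DB_File', $file, O_RDONLY, 0644, $DB_BTREE) ? 1 : 0;

if ($opened) {
    print "Opened $file read-only\n";
    untie %phone_db;
}
else {
    print "Cannot open $file: $!\n";
}
```

A real script would normally call die at this point, as in the examples above. Printing the message instead makes it easier to see what happens.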
Comments
Nice explanation. It helped me a lot in getting started with databases. Thanks a lot.
Retrieval of databases
Well, I've learned how to encode information to a database with this article, but I still don't understand database retrieval with DB_File yet. How would you write one program to encode the information and another program to retrieve it? I ask to clear up my confusion, because it seems like you have to encode the database every time you want to retrieve from it, completely defeating the purpose of saving.
thanks
Nice explanation. It helped me a lot in getting started with databases. thanks a lot. :)
cheers,
Kota.
Previous comment about anonymous hash
It appears this was a typo, should be %phone_db(), since there is no mention of this being a scalar reference of an anonymous hash, but a hash container. I am assuming this is the case, since all other examples do not use a dereference of the hash, they would have been $$phone_db{"key"}
Incorrect syntax in hash formation
The example under the Associative Arrays heading that shows how to store an anonymous hash's reference in a scalar is incorrect; instead of
$phone_db=( ... ), it should be $phone_db={ ... } (curly braces, not parens). FWIW, this is a very common misteak 8-}