Simple Access Berkeley DB Using STLdb4

 in
STLdb4 makes C++ programming with the Berkeley DB simpler and more effective.

The Berkeley DB library provides a solid implementation of both the B-Tree and Hash file structures. The implementation includes support for transactions, concurrent access of database files from multiple processes, and secondary indexing as well as logging and recovery.

In this article, I use the term database to refer to a B-Tree or Hash maintained by Berkeley DB. These databases allow rapid key to value look-ups.

The standard distribution of Berkeley DB comes with both a C and C++ API. Unfortunately, the standard Berkeley DB C++ API is a very thin wrapper neglecting modern C++ designs, such as smart pointers, standard C++ I/O streams, iterators, default arguments, operator overloading and so on. As a concrete example of the lack of reference counted smart pointers, the Berkeley DB API for Db::get(), shown in Listing 1, includes two Dbt pointers and the ownership of the memory for these is not immediately obvious.

The STLdb4 Project was created to make using the Berkeley DB from C++ easier. The STLdb4 API aims to make simple database interaction trivial while still keeping more advanced usage simple. A Berkeley DB object behaves similarly to an STL collection allowing look-ups and the setting of elements using an overloaded array operator. A full example program is shown in Listing 2. After execution, the file named with argv[1] will contain a Berkeley DB B-Tree file containing the foo-bar data pair.

The main class is the Database and the reference counted smart pointer for this class is called fh_database. This trend is used throughout STLdb4 where the smart pointer for Foo is called fh_foo. Databases can be opened either as in Listing 2 directly in the constructor or using the empty constructor and the open() or create() methods later. The main difference between open and create is that create requires a database type (for example B-Tree or Hash) and will create a new database at the given path if none exists already.

In the example in Listing 2, I don't have to close the database explicitly, because the smart pointer to the Database object will handle that for me.

Standard STL collection methods, such as empty(), size(), insert(), erase(), count(), begin(), end(), find(), upper_bound() and lower_bound(), all exist in the Database class. There are also partial versions of the latter three methods. The partial versions allow the looking up of entries with part of a key in B-Tree files. A bidirectional iterator object is returned by many of the above methods.

When storing large values in the database, using the standard I/O streams can be more efficient than using the get() method or overloaded array operator. This is because the standard I/O streams use partial read and write operations on the underlying Berkeley DB file. A standard I/O stream is obtained using the getIStream() and getIOStream() methods of the Database class.

The example in Listing 3 shows the standard C++ I/O stream interface for STLdb4. The housekeeping of performing partial I/O to the Berkeley DB file is handled by STLdb4. Accessing large chunks of data through this API maintains a low memory consumption. The API shows one of the used getIOStream() calls as having a ferris_ios first parameter. As the libferrisstreams library that STLdb4 uses offers generic I/O stream support, the ferris_ios is a backward-compatible extension of the std::ios bitfield. The extension allows specifying such things as memory mapped backing and sequential stream access to be nominated for use where supported. The output from running this example is shown in Listing 4.

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

STLdb4 looks good but seems a pain to compile

Rich20B's picture

Having read the article STLdb4 looks very good, though trying to get it compiled and installed is a pain.

If anybody else is having problems then I've written my experiences of compiling on Debian Etch here: http://www.richardbishop.net/wpress/?p=23

Nice!

Anonymous's picture

After reading this article, I have been using STLdb4, and am very pleased with it! It could use a few touchups, but since you can
always get the raw BerkeleyDB pointer there's nothing I haven't been
able to accomplish yet. *SO* much nicer to work with than the existing C++ implementation. Good job!

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix