Simple Access Berkeley DB Using STLdb4
The Berkeley DB library provides a solid implementation of both the B-Tree and Hash file structures. The implementation includes support for transactions, concurrent access of database files from multiple processes, and secondary indexing as well as logging and recovery.
In this article, I use the term database to refer to a B-Tree or Hash maintained by Berkeley DB. These databases allow rapid key to value look-ups.
The standard distribution of Berkeley DB comes with both a C and C++ API. Unfortunately, the standard Berkeley DB C++ API is a very thin wrapper neglecting modern C++ designs, such as smart pointers, standard C++ I/O streams, iterators, default arguments, operator overloading and so on. As a concrete example of the lack of reference counted smart pointers, the Berkeley DB API for Db::get(), shown in Listing 1, includes two Dbt pointers and the ownership of the memory for these is not immediately obvious.
Listing 1. Standard Berkeley DB C++ API Db::get()
#include <db_cxx.h>
int Db::get(DbTxn *txnid, Dbt *key, Dbt *data,
u_int32_t flags);The STLdb4 Project was created to make using the Berkeley DB from C++ easier. The STLdb4 API aims to make simple database interaction trivial while still keeping more advanced usage simple. A Berkeley DB object behaves similarly to an STL collection allowing look-ups and the setting of elements using an overloaded array operator. A full example program is shown in Listing 2. After execution, the file named with argv[1] will contain a Berkeley DB B-Tree file containing the foo-bar data pair.
The main class is the Database and the reference counted smart pointer for this class is called fh_database. This trend is used throughout STLdb4 where the smart pointer for Foo is called fh_foo. Databases can be opened either as in Listing 2 directly in the constructor or using the empty constructor and the open() or create() methods later. The main difference between open and create is that create requires a database type (for example B-Tree or Hash) and will create a new database at the given path if none exists already.
In the example in Listing 2, I don't have to close the database explicitly, because the smart pointer to the Database object will handle that for me.
Listing 2. STLdb4 Setting and Getting Values
#include <iostream>
#include <STLdb4/stldb4.hh>
using namespace STLdb4;
using namespace std;
int main( int argc, char** argv )
{
fh_database db = new Database(DB_BTREE, argv[1]);
db["foo"] = "bar";
cerr << "foo is set to:" << db["foo"] << endl;
return 0;
}Standard STL collection methods, such as empty(), size(), insert(), erase(), count(), begin(), end(), find(), upper_bound() and lower_bound(), all exist in the Database class. There are also partial versions of the latter three methods. The partial versions allow the looking up of entries with part of a key in B-Tree files. A bidirectional iterator object is returned by many of the above methods.
When storing large values in the database, using the standard I/O streams can be more efficient than using the get() method or overloaded array operator. This is because the standard I/O streams use partial read and write operations on the underlying Berkeley DB file. A standard I/O stream is obtained using the getIStream() and getIOStream() methods of the Database class.
The example in Listing 3 shows the standard C++ I/O stream interface for STLdb4. The housekeeping of performing partial I/O to the Berkeley DB file is handled by STLdb4. Accessing large chunks of data through this API maintains a low memory consumption. The API shows one of the used getIOStream() calls as having a ferris_ios first parameter. As the libferrisstreams library that STLdb4 uses offers generic I/O stream support, the ferris_ios is a backward-compatible extension of the std::ios bitfield. The extension allows specifying such things as memory mapped backing and sequential stream access to be nominated for use where supported. The output from running this example is shown in Listing 4.
Listing 3. Standard C++ I/O Streams for Berkeley DB Files
#include <iostream>
#include <STLdb4/stldb4.hh>
using namespace STLdb4;
using namespace Ferris;
using namespace std;
int main( int, char** )
{
fh_database db = new Database( DB_BTREE,
"/tmp/play.db" );
string data = "1234567890";
db[ "fred" ] = data;
cerr << "Initial value:" << db["fred"] << endl;
{
fh_iostream ss = db->getIOStream( "fred" );
ss << "54321";
}
cerr << "Second value:" << db["fred"] << endl;
{
fh_iostream ss = db->getIOStream( "fred" );
ss.seekp( 3 );
ss << "AAAA";
}
cerr << "post seekp value:" << db["fred"] << endl;
// truncate the iostream and write
{
Database::iterator di = db->find( "fred" );
fh_iostream oss = di.getIOStream(ios::trunc, 0);
oss << "sm";
}
cerr << "Trunc and write:" << db["fred"] << endl;
// append some more data to end of iostream
{
fh_iostream oss = db->find( "fred" )
.getIOStream( ios::ate, 0 );
oss << "AndMore";
}
cerr << "at end write value:"
<< db["fred"] << endl;
return 0;
}
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
| Designing Electronics with Linux | May 22, 2013 |
| Dynamic DNS—an Object Lesson in Problem Solving | May 21, 2013 |
| Using Salt Stack and Vagrant for Drupal Development | May 20, 2013 |
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
- New Products
- Linux Systems Administrator
- Senior Perl Developer
- Technical Support Rep
- UX Designer
- Web & UI Developer (JavaScript & j Query)
- Designing Electronics with Linux
- Dynamic DNS—an Object Lesson in Problem Solving
- Using Salt Stack and Vagrant for Drupal Development
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Nice article, thanks for the
3 hours 44 min ago - I once had a better way I
9 hours 30 min ago - Not only you I too assumed
9 hours 47 min ago - another very interesting
11 hours 40 min ago - Reply to comment | Linux Journal
13 hours 34 min ago - Reply to comment | Linux Journal
20 hours 28 min ago - Reply to comment | Linux Journal
20 hours 44 min ago - Favorite (and easily brute-forced) pw's
22 hours 35 min ago - Have you tried Boxen? It's a
1 day 4 hours ago - seo services in india
1 day 8 hours ago
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi

It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Featured Jobs
| Linux Systems Administrator | Houston and Austin, Texas | Host Gator |
| Senior Perl Developer | Austin, Texas | Host Gator |
| Technical Support Rep | Houston and Austin, Texas | Host Gator |
| UX Designer | Austin, Texas | Host Gator |
| Web & UI Developer (JavaScript & j Query) | Austin, Texas | Host Gator |
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?




Comments
STLdb4 looks good but seems a pain to compile
Having read the article STLdb4 looks very good, though trying to get it compiled and installed is a pain.
If anybody else is having problems then I've written my experiences of compiling on Debian Etch here: http://www.richardbishop.net/wpress/?p=23
Nice!
After reading this article, I have been using STLdb4, and am very pleased with it! It could use a few touchups, but since you can
always get the raw BerkeleyDB pointer there's nothing I haven't been
able to accomplish yet. *SO* much nicer to work with than the existing C++ implementation. Good job!