Data Modeling with Alzabo

Reuven takes a detour this month and shows you how to bridge the object-relational gap.

Over the last few months, we have been looking at server-side Java programming from a variety of perspectives. From servlets to JSPs to the Enhydra application server, we've seen several different ways to create dynamic, database-driven web sites using open-source Java technologies.

I had originally planned to continue in that vein this month, looking at Enhydra's intriguing DODS object-to-relational modeling software. DODS provides a high-level Java abstraction layer for tables in a relational database. DODS methods are translated automatically into the appropriate SQL, which is then handed to the database. The result: you see Java objects and methods, your database sees tables and SQL, and everyone is happy.

Unfortunately, the help and goodwill expressed by folks at Lutris (the corporate backer of Enhydra) were no match for the Israeli customs service and our local branch of FedEx. The CD and book with additional explanations of DODS sit in a warehouse as I write this, forcing me to take a short detour from my original plan.

However, investigating DODS for this month's article revived my interest in the subject of object-relational mappings. One of the most interesting and easy-to-use tools that I've seen for this purpose is Alzabo, a set of Perl modules that allows server-side Perl programmers to wrap their relational database schemata inside of an object. (The project is named for a creature in the science fiction work of Gene Wolfe.) I was quite impressed by what I saw and believe that many Perl programmers will be equally happy to discover such a powerful tool.

The Problem

Programmers have reaped many benefits by working with objects, from reusability to inheritance to encapsulation. But while programmers have taken to object-oriented programming in droves, object databases have been less popular for a variety of reasons. Instead, relational databases have become increasingly popular over the last few years, with huge quantities of data being placed within them. The problem, then, is how we can model our data as objects, while storing them as tables.

One possibility is to model each table as a class, each table column as an instance variable and each table row as an instance of that class. But anyone who has tried this quickly discovers it is easier said than done, particularly when creating web applications—how can we join two tables? What happens when two programs modify the same row in memory and only later commit those changes to the database? How can we ensure that changes to our class definition are reflected in the database and vice versa?

Another possibility is to read an entire table into an object instance, modifying the object and writing it out when a particular method is invoked. This works pretty well for small tables, but what happens when your tables become several megabytes (or gigabytes or terabytes) in size? Your boss might be willing to buy more memory for the web server but not if you're wasting it all reading entire tables into memory! Besides, modeling tables inside of your object means you also have to create a decent locking mechanism, complete with commits and rollbacks—something that most programmers are equipped to do.

We can easily dismiss these problems when working on a small application. But as applications and databases scale up, we want to ensure that things will work as expected. This is particularly true when creating an object-to-relational mapping system, such as Alzabo. One of my employees and I created a simple object-to-relational mapping middleware layer last year and were very happy with what we had done—until we found that we hadn't taken nearly enough corner cases into account, ending up with a mess of exceptions and default values.

Luckily for the Perl programmers among us, Dave Rolsky took the time to sit down and map out all of these problems, as well as many others. Alzabo gives us an object-oriented middleware layer that removes our need to interact directly with a database.

But Alzabo does more than provide a high-level interface to your database. It also gives you a programmatic way to modify your database schema definitions, including a browser-based table creation and maintenance tool that creates SQL for you automatically. Moreover, Alzabo can take an existing database and reverse-engineer it, allowing you to use Alzabo with existing databases as well as new ones.

Installing Alzabo

Like most Perl modules, Alzabo is available for download from CPAN. However, installing Alzabo can be more complicated than other modules, simply because Alzabo depends on many modules. Not only does Alzabo require the use of DBI (for database access) and either DBD::mysql or DBD::Pg (for PostgreSQL), but the browser-based schema-creation tool uses HTML::Mason, which in turn requires mod_perl. If all of these are installed on your system, then installing Alzabo should be relatively straightforward.

I was able to install Alzabo without too much difficulty, using the CPAN modules to download and install automatically each of the prerequisites and then Alzabo itself.

I accepted the default values for almost all of the questions asked during the software's configuration and installation, with the exception of the .mhtml suffix that Alzabo assumes you use for Mason components. I normally give Mason components the simple .html suffix; because my Apache configuration didn't know what to do with the .mhtml extension, it sent them as Content-type text/plain, displaying the Mason component's source code in my browser window. Changing the suffix of the installed Mason components to .html worked on my computer, but I could have modified my Mason or Apache configuration just as easily.

Alzabo tracks each schema in its own directory, called /usr/local/alzabo by default. Inside of this directory is a schemata directory, with a single subdirectory for each of the database schemata that Alzabo is modeling. For example, the appointments schemata would be in /usr/local/alzabo/schemas/appointments.

There were two small hitches in my Alzabo installation that I had to fix. First, I had to change the permissions /usr/local/alzabo so that my web user could read and write to it. Secondly, I had to modify my PostgreSQL startup script to include the -i option, so that clients could connect via the network. By default, most PostgreSQL installations (including RPM versions) do not turn on -i, meaning that even the most liberal configuration in pg_hba.conf (the PostgreSQL host access control file) will fail to work. While you normally can connect to PostgreSQL without the network using UNIX sockets, Alzabo always specifies a hostname, which in turn requires a network connection even on the local computer.

To install the web-based schema generator, at least one directory under your Apache server must be controlled by HTML::Mason. The Alzabo installation script will create a new/alzabo subdirectory there, along with the Mason components that create and modify the schema definitions that you create. My workstation, for instance, has all of its Mason components in /usr/local/apache/mason, which is mapped to URLs beginning with /mason. The web part of my Alzabo installation is thus in /usr/local/apache/mason/alzabo, accessible via the URL /mason/alzabo. If you have not done so already, you may wish to tell Apache (via the DirectoryIndex directive) that index.mhtml is an acceptable index page for a directory.

______________________

Webcast
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers

Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.

Learn More

Sponsored by AMD

White Paper
Red Hat White Paper: Using an Open Source Framework to Catch the Bad Guy

Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6

Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.

Learn more about catching the bad guy in this free white paper.

Learn More

Sponsored by DLT Solutions