Data Modeling with Alzabo
Over the last few months, we have been looking at server-side Java programming from a variety of perspectives. From servlets to JSPs to the Enhydra application server, we've seen several different ways to create dynamic, database-driven web sites using open-source Java technologies.
I had originally planned to continue in that vein this month, looking at Enhydra's intriguing DODS object-to-relational modeling software. DODS provides a high-level Java abstraction layer for tables in a relational database. DODS methods are translated automatically into the appropriate SQL, which is then handed to the database. The result: you see Java objects and methods, your database sees tables and SQL, and everyone is happy.
Unfortunately, the help and goodwill expressed by folks at Lutris (the corporate backer of Enhydra) were no match for the Israeli customs service and our local branch of FedEx. The CD and book with additional explanations of DODS sit in a warehouse as I write this, forcing me to take a short detour from my original plan.
However, investigating DODS for this month's article revived my interest in the subject of object-relational mappings. One of the most interesting and easy-to-use tools that I've seen for this purpose is Alzabo, a set of Perl modules that allows server-side Perl programmers to wrap their relational database schemata inside of an object. (The project is named for a creature in the science fiction work of Gene Wolfe.) I was quite impressed by what I saw and believe that many Perl programmers will be equally happy to discover such a powerful tool.
Programmers have reaped many benefits by working with objects, from reusability to inheritance to encapsulation. But while programmers have taken to object-oriented programming in droves, object databases have been less popular for a variety of reasons. Instead, relational databases have become increasingly popular over the last few years, with huge quantities of data being placed within them. The problem, then, is how we can model our data as objects, while storing them as tables.
One possibility is to model each table as a class, each table column as an instance variable and each table row as an instance of that class. But anyone who has tried this quickly discovers it is easier said than done, particularly when creating web applications—how can we join two tables? What happens when two programs modify the same row in memory and only later commit those changes to the database? How can we ensure that changes to our class definition are reflected in the database and vice versa?
Another possibility is to read an entire table into an object instance, modifying the object and writing it out when a particular method is invoked. This works pretty well for small tables, but what happens when your tables become several megabytes (or gigabytes or terabytes) in size? Your boss might be willing to buy more memory for the web server but not if you're wasting it all reading entire tables into memory! Besides, modeling tables inside of your object means you also have to create a decent locking mechanism, complete with commits and rollbacks—something that most programmers are equipped to do.
We can easily dismiss these problems when working on a small application. But as applications and databases scale up, we want to ensure that things will work as expected. This is particularly true when creating an object-to-relational mapping system, such as Alzabo. One of my employees and I created a simple object-to-relational mapping middleware layer last year and were very happy with what we had done—until we found that we hadn't taken nearly enough corner cases into account, ending up with a mess of exceptions and default values.
Luckily for the Perl programmers among us, Dave Rolsky took the time to sit down and map out all of these problems, as well as many others. Alzabo gives us an object-oriented middleware layer that removes our need to interact directly with a database.
But Alzabo does more than provide a high-level interface to your database. It also gives you a programmatic way to modify your database schema definitions, including a browser-based table creation and maintenance tool that creates SQL for you automatically. Moreover, Alzabo can take an existing database and reverse-engineer it, allowing you to use Alzabo with existing databases as well as new ones.
Like most Perl modules, Alzabo is available for download from CPAN. However, installing Alzabo can be more complicated than other modules, simply because Alzabo depends on many modules. Not only does Alzabo require the use of DBI (for database access) and either DBD::mysql or DBD::Pg (for PostgreSQL), but the browser-based schema-creation tool uses HTML::Mason, which in turn requires mod_perl. If all of these are installed on your system, then installing Alzabo should be relatively straightforward.
I was able to install Alzabo without too much difficulty, using the CPAN modules to download and install automatically each of the prerequisites and then Alzabo itself.
I accepted the default values for almost all of the questions asked during the software's configuration and installation, with the exception of the .mhtml suffix that Alzabo assumes you use for Mason components. I normally give Mason components the simple .html suffix; because my Apache configuration didn't know what to do with the .mhtml extension, it sent them as Content-type text/plain, displaying the Mason component's source code in my browser window. Changing the suffix of the installed Mason components to .html worked on my computer, but I could have modified my Mason or Apache configuration just as easily.
Alzabo tracks each schema in its own directory, called /usr/local/alzabo by default. Inside of this directory is a schemata directory, with a single subdirectory for each of the database schemata that Alzabo is modeling. For example, the appointments schemata would be in /usr/local/alzabo/schemas/appointments.
There were two small hitches in my Alzabo installation that I had to fix. First, I had to change the permissions /usr/local/alzabo so that my web user could read and write to it. Secondly, I had to modify my PostgreSQL startup script to include the -i option, so that clients could connect via the network. By default, most PostgreSQL installations (including RPM versions) do not turn on -i, meaning that even the most liberal configuration in pg_hba.conf (the PostgreSQL host access control file) will fail to work. While you normally can connect to PostgreSQL without the network using UNIX sockets, Alzabo always specifies a hostname, which in turn requires a network connection even on the local computer.
To install the web-based schema generator, at least one directory under your Apache server must be controlled by HTML::Mason. The Alzabo installation script will create a new/alzabo subdirectory there, along with the Mason components that create and modify the schema definitions that you create. My workstation, for instance, has all of its Mason components in /usr/local/apache/mason, which is mapped to URLs beginning with /mason. The web part of my Alzabo installation is thus in /usr/local/apache/mason/alzabo, accessible via the URL /mason/alzabo. If you have not done so already, you may wish to tell Apache (via the DirectoryIndex directive) that index.mhtml is an acceptable index page for a directory.