Interfacing Relational Databases to the Web

This document explains how to build a database-backed web site using Apache, the PHP3 module and the PostgreSQL relational database.
Design Considerations

We have already completed the first step of the design process—deciding what we need our application to do. What remains is defining the data that our application will access in three iterations:

  1. A high-level data model

  2. A low-level data model

  3. A set of views and transactions that are legal for application users

Generally, one initially models databases in an entity-relationship diagram, which represents each table as an entity with a set of attributes, and a “join”, or query, spanning multiple tables is represented by a relationship. Even if you're not going to go to the trouble of making up an E-R diagram, you should at least consider these concepts; we shall examine the entities and relationships for the address book in prose.

The low-level model we shall be using is the relational model, which has been around since 1970 and is the model used by most commercial relational databases, including Oracle, Sybase and Informix.

Finally, we shall define the ways in which our users can see the data. Once we've completed this, we're done with the hard part and can move on to the tedious part—the implementation.

Entities and Relationships

Since the entity-relationship model is such a high-level model, some of the hairier issues of the low-level model are not yet apparent, and our data model looks quite simple. The main advantage of the E-R model is that it presents a clear picture of the database miniworld—the segment of the real world that we're modeling—and is easily comprehensible to both laypersons and software engineers. Entities are defined as follows:

  • user: This describes an address book user. This entity has a unique ID, a login name, a password, a “real name” and an e-mail address.

  • addressbook: This is a “weak entity”--a set of persons for which a particular user has information.

  • person: This describes a person for whom address information is available. This entity has a first name, a middle initial, a last name and a dynastic identifier (i.e., Jr., III, etc.). A person also has one or many of each of the following: e-mail address, postal address, telephone number and URL.

Relationships among entities are defined in this way:

  • A user owns exactly one addressbook.

  • An address book contains many persons.

  • A person is an entry in exactly one addressbook.

If you've thought about databases before, you may be asking yourself, “Why can't a person be in more than one addressbook? Can't two users know the same person?” That is the sort of design decision you must make in this process—I chose to allow each user her own, private address book and avoid the difficult issues raised by sharing records. Allowing many people to share a record is like allowing global variables in a C program that can be modified by any function—you can easily get unexpected results when one function changes the variable's value without the others knowing about it.

Moving to the Relational Model

Now we need to move our high-level model (which people can understand) to a low-level model (which our database management system can understand). This process is quite straightforward, although the database designer still has some options at this point, involving normalization. Normalization (see Resources) is a formal process to quantify and measure the quality of a relational model; as always, the tradeoff is theoretical quality versus performance.

The quick-and-dirty algorithm for moving your data model from the entity-relationship to the relational model is this:

  1. Make sure every attribute of every entity is atomic.

  2. Make a table for every attribute of an entity which can hold multiple values and define a join which ties these to the associated entity.

  3. Make tables for each entity, using as fields the attributes you haven't dealt with yet. If an entity is involved in an n:1 relationship, include the key of the record it is related to as a foreign key.

This is necessarily a simplified version of the process; it does not deal with m:n relationships or some other details. For a more complete discussion of data model conversion, see a textbook on database theory.

SQL code for the “finished” relational model can be found at ftp.linuxjournal.com/pub/lj/listings/issue67/3475.tgz. All code is released under the GNU GPL.

______________________

Webcast
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers

Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.

Learn More

Sponsored by AMD

White Paper
Red Hat White Paper: Using an Open Source Framework to Catch the Bad Guy

Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6

Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.

Learn more about catching the bad guy in this free white paper.

Learn More

Sponsored by DLT Solutions