Integrating SQL with CGI, Part 2

This month we learn additional ways to retrieve data from a relational database and ways to divide our data into multiple tables for maximum efficiency.

In the last installment, we looked at ways to use a relational database server from within CGI programs. Relational databases are a great help, since they provide us with a relatively simple means of storing and retrieving information.

Everything in a relational database is stored in two-dimensional tables, with each row representing a record and each column a field within that record. When we use ASCII text files to store our information, we generally have to write our own routines for saving and retrieving that information; by structuring our information in tables, we can save ourselves quite a bit of time and energy, as well as ensure that our data will not be corrupted when multiple instances of our program attempt to modify the information.

Communication with a relational database is carried out using the Structured Query Language (SQL). Because SQL is not a programming language, we have to embed our SQL queries inside of a C or Perl program if we want to execute them.

Last month, we looked at a set of CGI programs which allowed visitors to our web site to send and receive web-based postcards. Users entered information about themselves and their postcards' recipients, along with the name of a graphic and a short text message. Each postcard was stored in the postcards table, with each row in the table representing a single postcard. By giving each row a unique ID number, our program was able to retrieve individual postcards from the database.

Indeed, the program send-postcard.pl that we examined last month demonstrated how it is possible to take inputs from an HTML form and turn them into a postcard stored in our database, as well as how to take that unique ID number and use it to display a postcard with show-postcard.pl. The unique ID number is generated by send-postcard.pl when it creates a new row in the postcards table and ensures that users can retrieve their own postcards while making it relatively difficult to find those addressed to other people.

The programs we are examining are written in a portable enough version of SQL that they should run on just about any relational database server. However, I have only tested these programs using MySQL, a nice, small relational database product for several versions of Unix, including Linux. You can find out more about MySQL on the Web at http://www.tcx.se/.

Replacing Names With Numbers

One of the problems with that version of send-postcard.pl was that it asked the user to name a graphic file to be inserted into the postcard she was sending. While this sounds like a nice idea, there were several problems with it, most notably the fact that the user could name a file that did not exist on the server. Moreover, there was no way for the user to know which graphics did exist on the server, short of writing another program that would present a directory listing.

Additionally, storing the file names for the graphics alongside the other postcard information is inefficient and can lead to synchronization problems. After all, what happens if you decide to change the name of a file from foo.gif to bar.gif? Under our current model, we would have to go through the entire “postcards” table and rename each row in which foo.gif appeared. This is not terribly difficult to do, but it means we have to think carefully before making such a change.

Finally, storing the names of the graphics files introduces the possibility of typographical errors. A user might inadvertently type fooo.gif, and thus prevent the postcard's recipient from seeing the desired graphic.

In order to solve all of these problems, we will move the names of the graphics files to a separate table, giving each graphic a unique ID number, much as we did for each postcard. With this system in place, we can then refer to graphics by number in the postcards table. We can still refer to an infinite number of graphics in this way, but we do so by using an integer, rather than a string, saving storage space on our file system. In addition, such consolidation allows us to change one or more of the graphics file name(s) or location(s) without having to worry about ruining our program's output.

Creating our Graphics Table

We can retrieve one or more rows from a table in our database using the SQL SELECT command, which returns all of the rows matching our selection criteria. Thus, if we are interested in retrieving the row with the ID number of 12345 from the postcards table, we can do so by sending the query:

select * from postcards where id_number = 12345

This will return a small table, the rows of which all have an id_number equal to “12345”. Since id_number is guaranteed to be unique, we can expect that the table returned by the SELECT command will consist of a single row and all of the columns contained in the postcards table. If id_number was not guaranteed to be unique, then SELECT might well return a number of rows from the table, rather than only one.

We can wrap SQL queries inside of programs, and CGI programs are full-fledged programs; thus, we can easily write a CGI program that:

  1. takes an argument from a user,

  2. uses that argument in the SELECT statement sent to the database server, and

  3. gives us a new SELECT statement each time we run the program.

Let's create a new table, in which the first column uniquely identifies the graphic, and the second column names the graphic. I do this using the interactive mysql program that comes with MySQL, which I enter by typing mysql<\!s>test. To create the new table, enter the following lines:

create table graphics (
        graphic_id mediumint not null primary key,
        graphic_file varchar(60) not null)

The interactive mysql program indicates that the command was executed successfully by giving us the following message:

Query OK, 0 rows affected (1.13 sec)
We have now created a table with two columns. The first, graphic_id, stores medium-sized integers, cannot be null and must be unique (“primary key” in SQL lingo). The second column stores a variable-length string containing up to 60 characters and must be non-null.

We can now insert a graphic into the graphics table with the following command:

INSERT INTO graphics (graphic_id,graphic_file)
        VALUES (12345, foo.gif);

Now if we ask for the contents of the graphics table, we get the following:

mysql> select * from graphics;
1 rows in set (0.04 sec)
+------------+--------------+
| graphic_id | graphic_file |
+------------+--------------+
|      12345 | foo.gif      |
+------------+--------------+
We have now proven it is possible to store a record of a particular graphics files, as well as to retrieve the name of that file based on a unique ID number. Now comes the hard part—making an association between the ID and the graphics file name.

First of all, we need to modify the definition of our postcards table, such that it now expects to get the ID number of a graphic, rather than the graphics file name. To create the postcards table, use the following SQL command:

create table postcards (
        id_number mediumint not null primary key,
        sender_name varchar(60) not null,
        sender_email varchar(50) not null,
        recipient_name varchar(60) not null,
        recipient_email varchar(50) not null,
        graphic_id mediumint null,
        postcard_text blob);

Notice that we have removed the graphic_name column from the table we defined last month and replaced it with a column named graphic_id—which has the same name, size and type in both of the tables. We don't need to give the columns identical names, but doing so makes it easier for us to keep track of things.

The most straightforward way to replace the old graphic_name column with the new graphic_id column is to create the table anew, as demonstrated with the above SQL command. However, this action results in the loss of any data already in our table. In the beginning of a project that uses a database, creating and destroying tables is quite normal—at least until you get the definition right. After a table has been used for a while, you probably don't wish to destroy your data.

To solve this problem, we simply add our new column (graphic_id) to the postcards table using the alter table command, which allows us to add one or more columns to an existing table. This is not the most efficient way to design a database, but it eliminates worry about programs which might be dependent on graphic_name.

______________________

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState