Integrating SQL with CGI, Part 2

This month we learn additional ways to retrieve data from a relational database and ways to divide our data into multiple tables for maximum efficiency.
Joining two tables

Our database now contains two tables: “postcards”, with all of the information required to assemble a complete postcard (including the ID number of the graphic we wish to display), and “graphics”, which maps those ID numbers to the names.

How do we create an association between the two? We already know that it is possible to use SQL's SELECT command to retrieve one or more columns from a table. SELECT can also be used to retrieve columns from multiple tables and can even join columns together from them.

For instance, let's get the sender name, recipient address and graphic name from our two tables, such that the graphic name for each postcard is displayed immediately next to the unique ID and recipient address.

First, let's add another graphic file to the graphics table, just for good measure:

mysql> INSERT INTO graphics (
        graphic_id,graphic_file)
        VALUES (67890, bar.gif);
Query OK, 1 rows affected (0.00 sec)
mysql> select * from graphics;
2 rows in set (0.16 sec)
+------------+--------------+
| graphic_id | graphic_file |
+------------+--------------+
|      12345 | foo.gif      |
|      67890 | bar.gif      |
+------------+--------------+

Let's now create a postcard to use for our example:

insert into postcards (id_number, sender_name,
        sender_email, recipient_name,
        recipient_email, postcard_text, graphic_id)
VALUES (99999, "Bill Clinton",
        "president@whitehouse.gov",
        "Al Gore",
        "vice.president@whitehouse.gov",
        "Please call. I have a new tax idea.",
        12345)
Notice that because they are integers, neither the postcard ID number nor the graphic ID number is surrounded by quotation marks. Also, notice how the initial list of columns in our INSERT command includes graphic_id, but leaves out graphic_text. Since graphic_text in postcards can contain null values, we can ignore it when inserting new rows into the table.

Now let's retrieve the sender name, recipient name and graphic file name for the postcard we just created:

mysql> select postcards.sender_name,
        postcards.recipient_name,graphics.graphic_file
        -> from postcards,graphics
        -> where postcards.id_number = 99999;
        2 rows in set (0.31 sec)
+--------------+----------------+--------------+
| sender_name  | recipient_name | graphic_file |
+--------------+----------------+--------------+
| Bill Clinton | Al Gore        | foo.gif      |
| Bill Clinton | Al Gore        | bar.gif      |
+--------------+----------------+--------------+

Egad—this isn't what we wanted at all. We wanted it to bring up the file name for the ID we specified, not the combination of all rows and columns in the “graphics” table with our postcard. That's what we wanted to do, but that's not what we told the computer to do. By formulating our SQL query as above, we inadvertently asked for all possible combinations of rows from postcards with rows from graphics, this combination is known in database circles as the “Cartesian product” of the two. While asking for a Cartesian product is not an error, it is almost always undesirable. When dealing with especially large tables, asking for such a combination of tables can result in a long, unnecessary computation that ties up the database server keeping it from performing other tasks.

How can we modify our database query so that it does what we originally wanted, namely, giving us the graphics file name in place of its ID number in the postcards table? The simplest way is to set up a restriction between the two tables, adding to the WHERE clause in the query, as follows:

mysql> select postcards.sender_name,
        postcards.recipient_name,graphics.graphic_file
        -> from postcards,graphics
        -> where postcards.id_number = 99999
        -> and postcards.graphic_id =
                graphics.graphic_id;
        1 rows in set (0.10 sec)
+--------------+----------------+--------------+
| sender_name  | recipient_name | graphic_file |
+--------------+----------------+--------------+
| Bill Clinton | Al Gore        | foo.gif      |
+--------------+----------------+--------------+

Now that's more like it. By requiring the equivalence between the graphic_id columns in both tables, we retrieved the information as if it came from a single table. That's part of the magic of SQL and relational databases. By combining tables in this way, you can make your data easier to handle by putting it in separate tables. However, when you retrieve the information, no one knows it came from separate tables, since a new, temporary table is returned to the caller.

This may seem like a silly example, but imagine a corporation with very strict pay scales whose payroll is handled by a relational database. If you give every employee an ID number indicating salary, you can give everyone a raise (or a reduction, depending on the company's financial state) by updating records in a salary table, rather than the table of employees. The next time you perform a join between the employee and salary tables, the new salary will be reflected automatically.

______________________

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState