Several months ago, we took a long look at Mason, a Web development environment that combines mod_perl, Apache and templates. One of the examples was that of a system for press releases, in which Mason components retrieve the latest press releases from a database. The style of programming demonstrated in this article could be characterized as “two-tier”, in which the Mason components speak directly with a database, using Perl's DBI and mod_perl's Apache::DBI.
But as several people on the Mason e-mail list noted, this approach—in which the SQL statements are directly inside of Mason components—is often unwise. Modifying the database definitions, or even the brand of database server we use, forces us to change the components themselves. In addition, non-web programs must re-implement SQL calls within their own programs, rather than reusing a commonly maintained and tested library.
We can solve both of these problems by adding an extra layer of software, sometimes known as “middleware”, between the database and the Mason components. This increasingly popular architecture is known as a “three-tiered” approach, because it means that we must work with three different sets of software services: the database, a “middleware” abstraction layer, and the implementation/presentation layer.
This month and next, we will look at a simple web-based address book and appointment calendar that demonstrate this three-tier approach. Along the way, I hope that you'll learn the advantages and disadvantages of this approach and be able to weigh it alongside other approaches when creating a web site. Once we have looked into this general architecture, we will be well prepared to look into Java Server Pages with Jakarta-Tomcat, and application servers. We'll examine the pitfalls involved with this approach, as well as how it can make development easier and more scalable in the long run.
The first tier, and perhaps the most significant, is the relational database. I will use PostgreSQL for this example, but it would not matter much if we used something like Oracle or MySQL.
Before we design our database, we must have a specification describing what we want to do. After all, the design of a database depends in no small part on how we intend to use it.
For the purposes of this article, we will keep the specification relatively short and ambiguous: I want to have an address book that I can view, search and update via the Web. In addition, I would like to be able to add, view and modify appointments using my web browser.
In order to accomplish this, we will need at least two tables, one containing people and the other containing appointments. Here is an initial stab at a definition of the people table (see Listing 1).
In other words, we will always store a person's first name, last name, country and e-mail address. Beyond that, we can store their address, city, state, postal code and some comments about them. This assumes that everyone we know has an e-mail address—an assumption that is increasingly true, but not necessarily a good attribute if you have friends and business contacts from outside of the computer industry.
Each of the entries in our people table will be uniquely identified by the person_id column, which is automatically incremented by PostgreSQL. In addition, we ensure that we only add each person once by checking for the uniqueness of their e-mail address. This allows us to have more than one friend named John Smith. It also means that we cannot store separate information about a couple with a shared e-mail address. There also is no good provision for handling people with multiple e-mail addresses.
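Listing 1 is not reproduced here, but a schema consistent with the description above might look something like the following sketch. The exact column types and sizes are my assumptions, not the actual listing:

```sql
-- A plausible sketch of the People table described above.
-- Column types are assumptions; the real definition is in Listing 1.
CREATE TABLE People (
    person_id   SERIAL PRIMARY KEY,   -- automatically incremented by PostgreSQL
    first_name  TEXT NOT NULL,
    last_name   TEXT NOT NULL,
    address1    TEXT NULL,
    address2    TEXT NULL,
    email       TEXT NOT NULL,
    city        TEXT NULL,
    state       TEXT NULL,
    postal_code TEXT NULL,
    country     TEXT NOT NULL,
    comments    TEXT NULL,
    UNIQUE(email)                     -- at most one row per e-mail address
);
```

Note how the UNIQUE(email) constraint, rather than the name columns, is what prevents duplicate entries: a second INSERT with the same e-mail address will be rejected, while two different people named John Smith are perfectly acceptable.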
Adding a new person into the people table is relatively easy:
    INSERT INTO People
       (first_name, last_name, address1, address2, email,
        city, state, postal_code, country, comments)
    VALUES
       ('Shai', 'Re\'em', '10 Helmonit', 'Apt. 7',
        'firstname.lastname@example.org', 'Modi\'in',
        NULL, 71700, 'Israel', 'Six-year-old nephew');
Because most of the columns are NULL, I can even get away with entering a bare minimum of columns:
    INSERT INTO People (first_name, last_name, email, country)
    VALUES ('Hadar', 'Re\'em', 'email@example.com', 'Israel');

Now we will create an appointments table in which we will store appointments with members of the people table:
    CREATE TABLE Appointments (
        person_id   INT4 NOT NULL REFERENCES People,
        start_time  TIMESTAMP NOT NULL,
        end_time    TIMESTAMP NOT NULL,
        notes       TEXT NULL CHECK (notes <> ''),
        UNIQUE(start_time)
    );

Once I have the appointments table defined, I can add a new appointment by inserting a new row into Appointments:
    INSERT INTO Appointments (person_id, start_time, end_time, notes)
    VALUES (1, 'November 22, 2000 19:00',
            'November 22, 2000 19:30', 'Read Dr. Seuss');

But because person_id is defined as a foreign key referencing People, we can only add an appointment if we are meeting with someone already in the people table. This might be adequate for our purposes, but a more sophisticated (and well-specified) system would presumably give us more flexibility. And of course, this database will not let me indicate that I am meeting with more than one person at a time.
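For instance, an attempt to schedule an appointment with a person_id that does not exist in People will be rejected by the database. Assuming no row in People has person_id 999, the following INSERT fails with a referential integrity error (the exact error text varies by PostgreSQL version):

```sql
-- person_id 999 does not reference any row in People,
-- so the REFERENCES constraint causes this INSERT to fail.
INSERT INTO Appointments (person_id, start_time, end_time, notes)
VALUES (999, 'November 23, 2000 10:00',
        'November 23, 2000 10:30', 'Meeting with nobody');
```

This is precisely the sort of integrity check we want the first tier to enforce: no matter which middleware or component inserts the row, the database itself guarantees that every appointment refers to a real person.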