Database Integrity and Web Applications
Want to improve the integrity of your data? Place constraints in the database, as well as in your application.
NoSQL, the catchall phrase for non-relational databases, is all the rage among Web developers. However, it's somewhat unfair and unhelpful to use the term NoSQL to describe them, given the variety of technologies involved. Even so, there are some fundamental differences between traditional relational databases and their NoSQL counterparts. For one, as the name implies, NoSQL databases don't use the standard SQL query language, and use either their own SQL-like language (for example, MongoDB) or an object-oriented API. Another difference is the lack of two-dimensional tables; whereas SQL databases operate solely with such tables, NoSQL databases eschew them in favor of name-value pairs or hash-like objects. And finally, NoSQL databases typically lack the features that led to the development of relational databases, namely transactions and data integrity.
There's no doubt that the flexibility NoSQL databases offer is attractive on a number of levels. Just as I enjoy working with dynamic languages in which I don't have to declare my variables (or their types) before I use them, it's nice to be able to store objects in my database without having to define the object structure in advance. If I want to add a new field to my Person object, I just do so, and the database magically catches up.
At the same time, there are many cases when I want the database to be tough with me and enforce the integrity of my data. That is, I want to be sure that even if I have made a mistake in my application, or if a user enters a value that shouldn't be allowed, the database won't allow that bad data to be stored. And yes, I believe that it's good to have such checks at the database level, rather than just at the application level—not only because it provides an additional guard against corrupt data, but also because the database often is accessed directly, outside the application itself. I often have to do what I refer to as "database surgery" on applications that are running for my clients, and it's always reassuring to know I cannot make a change manually that would corrupt the data.
Readers of my column know I'm a fan of both Ruby on Rails and of PostgreSQL, and I often use them together on projects. However, because Rails originally was developed under MySQL, which lacks many of the data-integrity aspects of PostgreSQL, the standard Active Record package fails to include such items as foreign keys in its base implementation. This means that although Rails will support PostgreSQL out of the box, it doesn't provide support for foreign keys, let alone data-integrity checks.
So, in this article, I look at some Ruby gems that make it possible to include foreign-key support in your Active Record models. I also describe ways you can take advantage of PostgreSQL's CHECK mechanism to ensure that your data is as safe as possible.
Where Do You Check Your Data?
Before I go into the actual code and implementation, I admit there are several philosophies of where and how to check your data. As I mentioned previously, I prefer to have the data checked at both the database and application levels, although sometimes I've been lazy and used the default in Rails that does so only at the application level.
Checking your data only at the application level is attractive in many ways. It makes the implementation easier, and it often will work just fine. It also, in the case of Rails, allows you to work in a single language (that is, Ruby) and put the integrity checks in your model files, where you're going to be using them.
But as I already said, if I have my checks only in the application, it's quite possible that when I try to access the database from outside my application, I'll end up messing things up, if only accidentally. Now, I'll admit this is the way many Web applications are designed, and it's not a fatal flaw. But it is something that I'll try to avoid in this article.
It would be a bigger mistake to try to do things the other way around—that is, to have the checks only in the database, rather than at the application level. Imagine, for example, if you were to have a "NOT NULL" constraint in the database, such that a particular column could not contain a NULL value. If you didn't protect against such things at the application layer, you might end up trying to put that data into your database. And although the database itself would not be corrupted, the application would generate an error, confusing and frustrating users.
So, upsetting and annoying as it might be, the best way to do things probably is to duplicate some work—adding the constraints on the database and then having them on the application as well. You could argue that you should have the constraints in place in a third location—in the HTML form the user often will use to submit information to the server. Fortunately, there is at least one technique for handling this sort of thing automatically. The client_side_validations gem for Rails copies the validations as best as possible, putting them into JavaScript within the user's views.
Foreign Keys
With this background, let's now assume that I want to implement a simple appointment calendar. In order to implement my calendar, I need to keep track of people (each of whom has a first name, a last name, an e-mail address and a telephone number) and appointments (which will indicate the date/time of the appointment, the person with whom I'm meeting and a note about the meeting itself).
At the database level, the tables will be fairly straightforward. This is how I would create the tables if I were doing so manually:
CREATE TABLE People (
id SERIAL,
first_name TEXT NOT NULL,
last_name TEXT NOT NULL,
email TEXT NOT NULL,
phone TEXT NOT NULL,
PRIMARY KEY(id)
);
CREATE TABLE Appointments (
id SERIAL,
person_id INTEGER REFERENCES People NOT NULL,
notes TEXT NOT NULL,
PRIMARY KEY(id)
);
The important thing here, for the purposes of this discussion, is the
REFERENCES People clause in the definition of the Appointments
table. With the REFERENCES clause in place, I can do the following:
INSERT INTO People (first_name, last_name, email, phone)
VALUES ('Reuven', 'Lerner', 'reuven@lerner.co.il',
↪'847-230-9795');
INSERT INTO Appointments (person_id, notes) VALUES (1, 'Meet with
↪myself');
Now, here's what happens if I try to delete myself from the People table:
atf=# DELETE FROM People WHERE id = 1;
ERROR: update or delete on table "people" violates foreign key
constraint "appointments_person_id_fkey" on table "appointments"
DETAIL: Key (id)=(1) is still referenced from table "appointments".
Because Appointments.person_id is a foreign key to People.id, I cannot remove a row from People if Appointments refers to it. This sort of relational integrity is a great thing for my application. No matter how you slice it, it means I cannot remove people from the system if they are scheduled for an appointment.
Rails Integration
If these tables were part of a Web application that I was developing in Ruby on Rails, I would use database migrations to create them. The migrations would look something like this (if combined into a single file):
class CreatePeopleAndAppointments < ActiveRecord::Migration
def change
create_table :people do |t|
t.string :first_name, :null => false
t.string :last_name, :null => false
t.string :email, :null => false
t.string :phone, :null => false
t.timestamps
end
create_table :appointments do |t|
t.integer :person_id, :null => false
t.text :notes, :null => false
t.timestamps
end
end
end
Now, this will give me the table definitions, but it won't give me the foreign keys, meaning that I could corrupt the database by deleting a single "People" record. In order to do that, at least in the default Rails configuration, I need to use an "execute" command in the migration to send explicit SQL to the database.
Fortunately, there is an easier way. The Foreigner gem, written by
Matthew Higgins, which works with MySQL as well as with PostgreSQL,
adds syntax to let you create and remove foreign keys within your
migrations. For example, with Foreigner active—putting it in the
Gemfile and then running bundle install—I can add a new migration
that does the following:
class AddForeignKey < ActiveRecord::Migration
def up
add_foreign_key :appointments, :people
end
def down
remove_foreign_key :appointments, :people
end
end
Sure enough, after running this migration, I have the foreign key that I was hoping for. Again, it's still important for me to have this check in my Rails model; otherwise, I easily could get myself into a situation that the database forbids but the model allows and, thus, generate a bad error for my users.
Note: if you add the foreign key after you already have populated the database with some data, you might run into trouble. That's because PostgreSQL won't let you add a foreign key if one or more rows fail to abide by the declared constraint. The NOT NULL constraint additionally ensures that you point to some person in the system. In other words, every appointment has to be with someone, and it has to be with someone in the People table.
Today’s modular x86 servers are compute-centric, designed as a least common denominator to support a wide range of IT workloads. Those generic, virtualized IT workloads have much different resource optimization requirements than hyperscale and cloud applications. They have resulted in a “one size fits all” enterprise IT architecture that is not optimized for a specific set of IT workloads, and especially not emerging hyperscale workloads, such as web applications, big data, and object storage. In this report, you will learn how shifting the focus from traditional compute-centric IT architectures to an innovative disaggregated fabric-based architecture can optimize and scale your data center.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
Web Development News
Developer Poll
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
| Non-Linux FOSS: Seashore | May 10, 2013 |
| Trying to Tame the Tablet | May 08, 2013 |
| Dart: a New Web Programming Experience | May 07, 2013 |
- RSS Feeds
- New Products
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- A Topic for Discussion - Open Source Feature-Richness?
- Drupal Is a Framework: Why Everyone Needs to Understand This
- Home, My Backup Data Center
- New Products
- Paranoid Penguin - Building a Secure Squid Web Proxy, Part IV
- Trying to Tame the Tablet
- Developer Poll
- Hey God - You may not be
3 hours 58 min ago - Reply to comment | Linux Journal
6 hours 30 min ago - Drupal is an Awesome CMS and a Crappy development framework
11 hours 9 min ago - IT industry leaders
13 hours 32 min ago - Reply to comment | Linux Journal
1 day 6 hours ago - Reply to comment | Linux Journal
1 day 8 hours ago - Reply to comment | Linux Journal
1 day 10 hours ago - great post
1 day 10 hours ago - Google Docs
1 day 11 hours ago - Reply to comment | Linux Journal
1 day 15 hours ago







Comments
rewritten for non-empty
INSERT INTO Appointments (person_id, notes)
SELECT (people.id, 'Meet with myself')
FROM people
WHERE (first_name, last_name)
IN ('Reuven', 'Lerner');
Reply to comment | Linux Journal
I believe everything typed was actually very reasonable.
However, what about this? what if you added a little content?
I ain't saying your content isn't solid, but what if you added something that makes people want
more? I mean Reply to comment | Linux Journal is a little
boring. You should peek at Yahoo's front page and see how they write article titles to grab people to open the links. You might add a related video or a related picture or two to grab readers excited about what you've got to say.
Just my opinion, it might bring your posts a little bit more interesting.
Question about your SQL
In the second INSERT ("INSERT INTO Appointments") you assume that the person_id is "1", the key assigned by Postgres in the first INSERT ("INSERT INTO People").
Correct me if I'm wrong but you would only know that if you ran your queries against an empty database.