Database Integrity and Web Applications

Now, let's say you already have been working on a project, but you have neglected to define foreign keys. You could go through each table, figure out which is pointing to which, and then handle it accordingly, adding the sorts of migrations shown above. But the Immigrant gem (also written and released by Matthew Higgins) looks at Rails models and adds foreign keys for every has_one and has_many column that points to a belongs_to column in another model.

Additional Checks

Foreigner is a great step forward, helping ensure the integrity of your data. But there is another issue, more subtle than that of foreign keys, to which I'm still susceptible. While I have ensured that the person_id will not be NULL and will point to a record in the People table, I haven't ensured that the People table will contain valid and reasonable values. That is, although the first_name, last_name, e-mail and phone columns will not contain NULLs, they might contain empty strings or values (for example, e-mail addresses) that are invalid.

At the database level, I could handle this sort of problem with a CHECK clause. Such a clause ensures that illegal data—for whatever "illegal" values I want to define—cannot be placed in the database. This could be anything from a certain minimum or maximum length of text string, to a pattern in the text, to minimum or maximum numbers. For example, I often like to indicate that the database may not contain a price lower than 0, and that e-mail addresses need to match a very basic regular expression. (Note that matching e-mail addresses is not for the faint of heart, at least if you want to do it correctly.)

So given my People table, I could define a set of CHECK clauses that would ensure that the first_name field is non-empty. In other words, the first_name cannot be NULL, but it also cannot be the empty string:


ALTER TABLE People ADD CONSTRAINT
    people_first_name_non_empty_chks CHECK (first_name <> '');

Note that although I could add a number of checks within this single constraint, I prefer not to do so. That gives me greater flexibility to add and remove constraints, and it also ensures that when a constraint is violated, PostgreSQL will accurately tell me which one it was.

Now, how can I implement this in my Rails migration, and do I want to? My answer, as you can imagine, is that this would indeed be a good thing to include in the database.

Once again, a Ruby gem comes to the rescue. This one, sexy_pg_constraints, was written by Maxim Chernyak, but it has since forked and is maintained by several other people.

I can include it in my Rails application by adding the following to my Gemfile:


gem 'Empact-sexy_pg_constraints', :require => 'sexy_pg_constraints'

and by uncommenting the following line in config/application.rb:


config.active_record.schema_format = :sql

Simply put, sexy_pg_constraints adds a number of additional attributes that you can pass to an add_column or change_column invocation within your migration. For example, let's say I want to make sure, as before, that the first_name column is never blank—neither NULL nor the empty string. I can do this by saying:


class AddConstraints < ActiveRecord::Migration
  def up
    constrain :people, :first_name, :not_blank => true
  end

  def down
    deconstrain :people, :first_name
  end
end

After I apply this migration, I discover the following in my table definition:


"people_first_name_not_blank" CHECK (length(btrim(first_name::text)) > 0)

In other words, I am checking to ensure that after removing all whitespace from either side of the string, the length is greater than 0. Sounds like it worked to me!

sexy_pg_constraints comes with a large number of options, including whitelisting, blacklisting, matching e-mail addresses and checking the format of data. The only thing this gem is missing, by my estimate, is a way to get the model file to communicate automatically with the database and/or the migrations, so that you don't have to add these in both places manually. Even so, by adding these constraints, you improve the integrity of your data without having to go too far out of the Rails migration framework.

Conclusion

Database constraints are there to save people from themselves, and they are a great feature offered by relational databases. In exchange for a bit of work up front, and for some small performance penalties during runtime, you can ensure that your data remains intact. I would argue that doing this sort of validation is important at both the database and the application levels. A number of Ruby gems, as I explained here, make it possible to do this sort of integration within Ruby on Rails.

Resources

Information about PostgreSQL and constraints is on the Web at http://www.postgresql.org/docs/current/static/ddl-constraints.html, and this should be read by anyone interested in the subject.

The Foreigner and Immigrant gems are on GitHub at https://github.com/jenseng/foreigner and https://github.com/jenseng/immigrant, respectively.

A recent branch of the sexy_pg_constraints gem is at https://github.com/carbonfive/sexy_pg_constraints.

Database image via Shutterstock.com.

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

rewritten for non-empty

Anonymous's picture

INSERT INTO Appointments (person_id, notes)
SELECT (people.id, 'Meet with myself')
FROM people
WHERE (first_name, last_name)
IN ('Reuven', 'Lerner');

Question about your SQL

Don Barthel's picture

In the second INSERT ("INSERT INTO Appointments") you assume that the person_id is "1", the key assigned by Postgres in the first INSERT ("INSERT INTO People").

Correct me if I'm wrong but you would only know that if you ran your queries against an empty database.

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix