Open-Source Databases, Part II: PostgreSQL
There are some problems with our table definition. Although we have effectively stopped people from storing NULL values in our TEXT columns, we haven't done anything to stop them from entering empty strings. In addition, we might want to ensure that the email_address column looks at least something like an e-mail address.
We can do this by adding constraints to our columns—tiny functions that check the value being inserted or updated. If the new value doesn't fit the definition, PostgreSQL refuses to allow its insertion. Here's a new definition of our table, with some constraints defined:
CREATE TABLE People ( id SERIAL NOT NULL, first_name TEXT NOT NULL CHECK (first_name <> ''), last_name TEXT NOT NULL CHECK (last_name <> ''), email_address TEXT NOT NULL CHECK (email_address ~* '.@.+\\\.'), added_at TIMESTAMP NOT NULL DEFAULT NOW(), PRIMARY KEY(id), UNIQUE(email_address) );
If we inspect our table definition, it has changed somewhat, to include the constraints:
linux=# \d people
Table "public.people"
Column | Type | Modifiers
--------------+-----------------------------+----------------------
id | integer | not null default
nextval('people_id_seq'::regclass)
first_name | text | not null
last_name | text | not null
email_address | text | not null
added_at | timestamp without time zone | not null default now()
Indexes:
"people_pkey" PRIMARY KEY, btree (id)
"people_email_address_key" UNIQUE, btree (email_address)
Check constraints:
"people_email_address_check" CHECK (email_address ~* '.@.+\\.'::text)
"people_first_name_check" CHECK (first_name <> ''::text)
"people_last_name_check" CHECK (last_name <> ''::text)
Let's see what happens if we violate these constraints:
linux=# insert into people (first_name , last_name, email_address)
values ('', 'Lerner', 'reuven@lerner.co.il');
ERROR: new row for relation "people" violates check constraint
"people_first_name_check"
linux=# insert into people (first_name , last_name, email_address)
values ('Reuven2', 'Lerner2', 'reuven');
ERROR: new row for relation "people" violates check constraint
"people_email_address_check"Sure enough, our constraints help to ensure that our database is in order.
The most common type of constraint is a foreign key, in which one table points to another. For example:
CREATE TABLE Appointments (
person_id INTEGER NOT NULL REFERENCES People,
starting_time TIMESTAMP NOT NULL,
duration INTERVAL NOT NULL,
notes TEXT NULL
);If we try to create an appointment that refers to a non-existent person, we will be rejected:
INSERT INTO Appointments (person_id, starting_time, duration, notes)
VALUES (5000, '2007-Feb-12 13:00', interval '1 hour', 'Lunch');
ERROR: insert or update on table "appointments" violates foreign key
constraint "appointments_person_id_fkey"
DETAIL: Key (person_id)=(5000) is not present in table "people".Foreign-key constraints help in the other direction as well. If you try to drop a row to which a foreign key points, PostgreSQL will refuse the request, indicating that you must first delete the foreign key. You can adjust the rules for these constraints by setting the ON UPDATE or ON DELETE modifiers to the foreign key definition.
This list of features is just the tip of the iceberg. And that's part of the magic of PostgreSQL—out of the box, it's straightforward and easy to use, but you almost always can redefine and extend existing functionality with your own code and data. The built-in operators, along with the flexible ways in which they can be combined and further enhanced with your own functions and definitions, make for a powerful combination. I don't often use unions or intersections, but I do often use views.
For example, one of my favorite features is the ability to use subselects just about anywhere you would have a value. If you have someone's e-mail address, you can use that to INSERT a row into Appointments in a single command:
INSERT INTO Appointments (person_id, starting_time, duration, notes)
VALUES ((SELECT id
FROM People
WHERE email_address = 'reuven@lerner.co.il'),
'2007-Feb-12 13:00', interval '1 hour', 'Lunch');If the existing data types aren't enough, we can construct our own. PostgreSQL already comes with a number of existing data types, including geometric shapes, IP addresses and even ISBNs.
If we want to create more than one table with similar characteristics, we can take advantage of PostgreSQL's object-oriented features. Thus, we could have a People table and a Managers table, in which the definition of Manager inherits the characteristics of People and adds its own extensions.
You also can create your own server-side functions, in a variety of different languages—from PostgreSQL's own Pl/pgsql to specialized versions of Perl, Python, Tcl, Java, Ruby and the R statistical language. These functions can return individual values or entire tables, and can be used in triggers. You also can use these functions to rewrite the rules for inserting, updating and deleting data from a table or even a view.
But, perhaps the most important feature of all is the built-in support for transactions. Transactions are an essential part of database programming, in that they allow us to combine multiple queries into one all-or-nothing action. The classic example of a transaction is the movement of money from one bank account to another; if the power goes out, you want to be sure that the money was moved, or that it wasn't. It would be unacceptable for the money to disappear altogether or for it to appear in both accounts when the lights come back on.
Recent versions of PostgreSQL have enhanced its transactional capabilities. Not only can you commit or roll back a transaction, but you also can define savepoints inside a transaction. If something goes wrong, you can either roll back the entire transaction or merely go to the previous savepoint. Moreover, PostgreSQL now supports two-phase commits, making it possible to synchronize distributed processes that require communication and coordination.
If anything goes wrong, PostgreSQL also provides a PITR (point-in-time recovery) through a write-ahead log (WAL), ensuring that even if the power is cut off at the most critical moment, transactions will be committed or rolled back, and that as many transactions as possible will be committed.
You might have noticed that I haven't mentioned locking at all. That's because, for the most part, PostgreSQL users don't have to worry about locking. The lack of locking is handled using a system known as MVCC (multiversion concurrency control), which has only one drawback, namely the creation of many unused and cast-off database rows. The traditional way to handle this in PostgreSQL is to VACUUM the database regularly, removing old rows and clearing up space. Recent versions now include an auto-vacuum agent, reducing or even eliminating the need to VACUUM on a regular basis.
Finally, recent versions of PostgreSQL include support for tablespaces. This means you can spread tables across different directories and filesystems, rather than keep everything under the directory defined by your installation. This can boost performance or reliability significantly, particularly on large databases.
Today’s modular x86 servers are compute-centric, designed as a least common denominator to support a wide range of IT workloads. Those generic, virtualized IT workloads have much different resource optimization requirements than hyperscale and cloud applications. They have resulted in a “one size fits all” enterprise IT architecture that is not optimized for a specific set of IT workloads, and especially not emerging hyperscale workloads, such as web applications, big data, and object storage. In this report, you will learn how shifting the focus from traditional compute-centric IT architectures to an innovative disaggregated fabric-based architecture can optimize and scale your data center.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
| Non-Linux FOSS: Seashore | May 10, 2013 |
| Trying to Tame the Tablet | May 08, 2013 |
| Dart: a New Web Programming Experience | May 07, 2013 |
- New Products
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Home, My Backup Data Center
- What's the tweeting protocol?
- New Products
- RSS Feeds
- Readers' Choice Awards
- Dart: a New Web Programming Experience
Enter to Win an Adafruit Prototyping Pi Plate Kit for Raspberry Pi

It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Prototyping Pi Plate Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- Next winner announced on 5-21-13!
Free Webinar: Linux Backup and Recovery
Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.
In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.




13 hours 18 min ago
15 hours 50 min ago
17 hours 7 min ago
17 hours 42 min ago
18 hours 5 min ago
22 hours 53 min ago
23 hours 40 min ago
1 day 1 hour ago
1 day 2 hours ago
1 day 4 hours ago