PostgreSQL—the Linux under the Databases
The best way to get started with PostgreSQL is to read the user manual. It is written for Postgres95 v1.0 and dates back to September 1995, but the information provided there is still useful. Other valuable sources of information are the tutorials, the man pages and the various files in the /doc directory. I will be using the examples from the tutorial in this article whenever possible.
To begin, create a database, i.e., a named container for several tables and accompanying data such as indices and views. To use the database named tutorial, type this command:
createdb tutorial
Now, you are automatically the owner of the database and have full access to it. Other users have access only if you grant them the appropriate rights.
Next, connect that database to a client program. We will use psql, which comes with PostgreSQL. If you prefer a graphical interface, there is also a Motif client available called mpsql (with pre-compiled binaries for Linux). mpsql has great editing capabilities but doesn't provide special local commands for listing existing tables and databases. It also lacks a help system that is available in psql. To use psql, type:
psql tutorial
This command provides you with a shell-like environment in which you can issue SQL commands. Due to the Readline support, you have command history and file name completion with the same key bindings as the bash shell. You can also enter local commands which are processed by the client first if you prefix them with a backslash. Enter \? to get a list of local commands. All other commands are sent directly to the back end. Commands can be typed on separate lines. They will be stored in a local buffer until you enter a line terminated with a ; (semicolon), then the buffer is sent to the back end. Help on SQL commands is available by typing \h.
SQL commands can be passed on the command line for shell scripting. The \i command reads a file from disk and executes its contents as SQL commands. Be sure to always use absolute pathnames with psql, because the back end knows nothing about the current working directory of the client.
Next, create the following two tables:
create table cities ( name text, population float8, altitude int--this is a comment ); create table capitals ( state char2 ) inherits (cities);
The text type is a string of characters of variable length. If you enter int, you get a four-byte integer value. PostgreSQL comes with 43 predefined data types including several types for time and date values, many types for geometrical objects such as point, circle and polygon and a boolean type. Arrays are also supported. All types are described in the pgbuiltin(l) manual page. If you need additional types, you can add your own. Note that identifier names have not been case sensitive since v6.1.
This example also illustrates a special feature of PostgreSQL: object inheritance. The second table inherits all the fields from the cities table and adds one more field. Later I'll show how to take advantage of that feature.
One often needed feature is missing from PostgreSQL, that is, the ability to define primary keys in the create clause. Primary keys are used to define a default sort order for the tuples and to ensure that the field with that key can't hold duplicate values. It is not supported due to the method used to store the records (or tuples). Every tuple in the database gets a unique object identifier (oid) value, which is unique not only in the table but also in the whole database. There is no way to guarantee a specific order in the table. As of version 6.0 of PostgreSQL it is possible to create a unique index, so that the same effect can be achieved with indices. To create a unique index for our example table, type:
create unique index on cities using btree (name);
There are three methods available for index creation: btree, rtree and hash. The method can be specified after the keyword “using”. Only the btree method allows multiple key indices with up to seven keys. Note that not all data types are supported by all index types. In particular, rtree indices are available only for geometric types. If no index type is specified, btree is used as the default. Indices increase the access speed to tables significantly and should be used whenever possible.
The maximum size of a tuple is 8192 bytes. In reality, it is somewhat smaller, because PostgreSQL needs some place for storing internal data. The amount of this space varies from platform to platform. If you need larger fields, use the large objects interface, which provides unlimited fields of transparent data, like MEMO or BLOB fields in other databases; however, you need special functions to access them.
To enter data in the tables, use the insert command:
INSERT INTO cities VALUES ('San Francisco',
7.24E+5, 63);
INSERT INTO cities VALUES ('Las Vegas', 2.583E+5,
2174);
INSERT INTO cities VALUES ('Mariposa', 1200,
1953);
INSERT INTO capitals VALUES ('Sacramento',
3.694E+5, 30, 'CA');
INSERT INTO capitals VALUES ('Madison', 1.913E+5,
845, 'WI');
To get the data out of the tables, use the select command. It's a very powerful command, so I will demonstrate only some of its characteristics.
-- this will return all records in the table select * from cities; select * from capitals; -- to get also the records of the -- inherited tables, use this syntax: select * from cities*; -- here are some variants to limit the returned -- data: select name, altitude from cities where altitude > 500;To change some values in the table, use the update command, which has a similar syntax as the select command:
update cities -- population grows by 10% set population = population * 1.1 where name = 'Mariposa';When you update your data regularly, you will notice that the tables grow continuously, even if you haven't added new tuples. This is not a bug—it's another special feature called time travel. PostgreSQL keeps a history of all data changes in the table. To access this data, you have to use a special qualifier:
select name, population from cities['epoch', 'now'] where name = 'Mariposa';This example will list all the values of the two fields, name and population of Mariposa, from the creation of the database up to the present. If you don't wish to retain the history data, you can delete it using the vacuum command. In the future version 7.0, the time-travel feature will vanish, but at this time you need to vacuum your databases regularly. The vacuum command also has the additional purpose of updating the internal data in order to make faster querys possible. Therefore, it is a good idea to define a cron job that runs vacuum every night.
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
| Designing Electronics with Linux | May 22, 2013 |
| Dynamic DNS—an Object Lesson in Problem Solving | May 21, 2013 |
| Using Salt Stack and Vagrant for Drupal Development | May 20, 2013 |
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
- RSS Feeds
- Dynamic DNS—an Object Lesson in Problem Solving
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Designing Electronics with Linux
- Using Salt Stack and Vagrant for Drupal Development
- New Products
- A Topic for Discussion - Open Source Feature-Richness?
- Drupal Is a Framework: Why Everyone Needs to Understand This
- Validate an E-Mail Address with PHP, the Right Way
- What's the tweeting protocol?
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi

It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?




6 hours 47 min ago
11 hours 14 min ago
14 hours 50 min ago
15 hours 22 min ago
17 hours 46 min ago
17 hours 49 min ago
17 hours 50 min ago
22 hours 15 min ago
1 day 6 min ago
1 day 5 hours ago