PostgreSQL—the Linux under the Databases
The best way to get started with PostgreSQL is to read the user manual. It is written for Postgres95 v1.0 and dates back to September 1995, but the information provided there is still useful. Other valuable sources of information are the tutorials, the man pages and the various files in the /doc directory. I will be using the examples from the tutorial in this article whenever possible.
To begin, create a database, i.e., a named container for several tables and accompanying data such as indices and views. To create the database named tutorial, type this command:
createdb tutorial
Now, you are automatically the owner of the database and have full access to it. Other users have access only if you grant them the appropriate rights.
Next, connect a client program to that database. We will use psql, which comes with PostgreSQL. If you prefer a graphical interface, there is also a Motif client available called mpsql (with pre-compiled binaries for Linux). mpsql has great editing capabilities but doesn't provide special local commands for listing existing tables and databases. It also lacks the help system that is available in psql. To use psql, type:
psql tutorial
This command provides you with a shell-like environment in which you can issue SQL commands. Thanks to Readline support, you have command history and file name completion with the same key bindings as the bash shell. You can also enter local commands, which are processed by the client first, by prefixing them with a backslash. Enter \? to get a list of local commands. All other commands are sent directly to the back end. A command can span several lines: the lines are stored in a local buffer until you enter a line terminated with a semicolon (;), at which point the buffer is sent to the back end. Help on SQL commands is available by typing \h.
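The buffering rule just described can be modeled in a few lines of Python. This is only a toy sketch of the behavior as the text describes it, not how psql is actually implemented; the function name dispatch and the sample input are made up for illustration.

```python
def dispatch(lines):
    """Toy model: route backslash commands locally, buffer SQL until ';'."""
    local, sent, buffer = [], [], []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("\\"):
            # Local command: handled by the client, never sent to the back end.
            local.append(stripped)
            continue
        buffer.append(stripped)
        if stripped.endswith(";"):
            # A line ending in a semicolon flushes the whole buffer.
            sent.append(" ".join(buffer))
            buffer = []
    return local, sent, buffer


typed = [
    "\\?",
    "select name, altitude",
    "from cities",
    "where altitude > 500;",
    "select * from capitals",
]
local, sent, pending = dispatch(typed)
print("local commands:", local)
print("sent to back end:", sent)
print("still buffered:", pending)
```

The last statement stays in the buffer because no line ending in a semicolon has been entered yet.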
SQL commands can be passed on the command line for shell scripting. The \i command reads a file from disk and executes its contents as SQL commands. Be sure to always use absolute pathnames with psql, because the back end knows nothing about the current working directory of the client.
Next, create the following two tables:
create table cities (
    name text,
    population float8,
    altitude int -- this is a comment
);

create table capitals (
    state char2
) inherits (cities);
The text type is a string of characters of variable length. If you enter int, you get a four-byte integer value. PostgreSQL comes with 43 predefined data types including several types for time and date values, many types for geometrical objects such as point, circle and polygon and a boolean type. Arrays are also supported. All types are described in the pgbuiltin(l) manual page. If you need additional types, you can add your own. Note that identifier names have not been case sensitive since v6.1.
This example also illustrates a special feature of PostgreSQL: object inheritance. The second table inherits all the fields from the cities table and adds one more field. Later I'll show how to take advantage of that feature.
One often-needed feature is missing from PostgreSQL: the ability to define primary keys in the create clause. Primary keys are used to define a default sort order for the tuples and to ensure that the key field cannot hold duplicate values. They are not supported because of the method used to store the records (or tuples). Every tuple gets an object identifier (oid) value, which is unique not only in the table but also in the whole database, and there is no way to guarantee a specific order in the table. As of version 6.0 of PostgreSQL, it is possible to create a unique index, so the same effect can be achieved with indices. To create a unique index for our example table, type:
create unique index on cities using btree (name);
There are three methods available for index creation: btree, rtree and hash. The method can be specified after the keyword “using”. Only the btree method allows multiple key indices with up to seven keys. Note that not all data types are supported by all index types. In particular, rtree indices are available only for geometric types. If no index type is specified, btree is used as the default. Indices increase the access speed to tables significantly and should be used whenever possible.
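To see what a unique index buys you, here is a small sketch that can be run without a PostgreSQL server. It uses Python's built-in sqlite3 module as a stand-in, which is an assumption made purely for convenience: SQLite has no "using btree" clause, and the index name cities_name_idx is invented for this example. The point is only that a second row with the same name is rejected.

```python
import sqlite3

# SQLite in memory stands in for the PostgreSQL back end (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("create table cities (name text, population float8, altitude int)")
# Hypothetical index name; SQLite does not accept PostgreSQL's "using btree".
conn.execute("create unique index cities_name_idx on cities (name)")
conn.execute("insert into cities values ('Mariposa', 1200, 1953)")

try:
    # Same name as the existing row, so the unique index must refuse it.
    conn.execute("insert into cities values ('Mariposa', 1300, 1953)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("select count(*) from cities").fetchone()[0]
print("duplicate rejected:", duplicate_rejected)
print("rows in table:", row_count)
```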
The maximum size of a tuple is 8192 bytes. In reality, it is somewhat smaller, because PostgreSQL needs some space for storing internal data. The amount of this space varies from platform to platform. If you need larger fields, use the large objects interface, which provides unlimited fields of transparent data, like MEMO or BLOB fields in other databases; however, you need special functions to access them.
To enter data in the tables, use the insert command:
INSERT INTO cities VALUES ('San Francisco', 7.24E+5, 63);
INSERT INTO cities VALUES ('Las Vegas', 2.583E+5, 2174);
INSERT INTO cities VALUES ('Mariposa', 1200, 1953);
INSERT INTO capitals VALUES ('Sacramento', 3.694E+5, 30, 'CA');
INSERT INTO capitals VALUES ('Madison', 1.913E+5, 845, 'WI');
To get the data out of the tables, use the select command. It's a very powerful command, so I will demonstrate only some of its characteristics.
-- this will return all records in the table
select * from cities;
select * from capitals;

-- to also get the records of the
-- inherited tables, use this syntax:
select * from cities*;

-- here are some variants to limit the returned
-- data:
select name, altitude
    from cities
    where altitude > 500;

To change some values in the table, use the update command, whose syntax is similar to that of the select command:
update cities
    -- population grows by 10%
    set population = population * 1.1
    where name = 'Mariposa';

When you update your data regularly, you will notice that the tables grow continuously, even if you haven't added new tuples. This is not a bug; it's another special feature called time travel. PostgreSQL keeps a history of all data changes in the table. To access this data, you have to use a special qualifier:
select name, population
    from cities['epoch', 'now']
    where name = 'Mariposa';

This example lists all the values of the two fields, name and population, of Mariposa from the creation of the database up to the present. If you don't wish to retain the history data, you can delete it using the vacuum command. In the future version 7.0, the time-travel feature will vanish, but at this time you need to vacuum your databases regularly. The vacuum command also has the additional purpose of updating the internal data in order to make faster queries possible. Therefore, it is a good idea to define a cron job that runs vacuum every night.
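Why tables grow under updates, and why vacuuming shrinks them again, is easier to picture with a small model. The sketch below is not PostgreSQL's storage format; it merely imitates the idea with Python's built-in sqlite3 module. The table cities_history, its valid_from and valid_to columns, and the helper update_population are all hypothetical names chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Every row version records when it became valid and when it was superseded.
conn.execute(
    "create table cities_history "
    "(name text, population real, valid_from int, valid_to int)"
)


def update_population(conn, name, factor, now):
    """Close the current version and append a new one instead of overwriting."""
    (old,) = conn.execute(
        "select population from cities_history "
        "where name = ? and valid_to is null",
        (name,),
    ).fetchone()
    conn.execute(
        "update cities_history set valid_to = ? "
        "where name = ? and valid_to is null",
        (now, name),
    )
    conn.execute(
        "insert into cities_history values (?, ?, ?, null)",
        (name, old * factor, now),
    )


conn.execute("insert into cities_history values ('Mariposa', 1200, 1, null)")
update_population(conn, "Mariposa", 1.1, now=2)
update_population(conn, "Mariposa", 1.1, now=3)

# The full history: one row per version, although only one city was ever added.
history = conn.execute(
    "select population, valid_from, valid_to from cities_history "
    "where name = 'Mariposa' order by valid_from"
).fetchall()
for version in history:
    print(version)

# The model's counterpart to vacuum: discard every superseded version.
conn.execute("delete from cities_history where valid_to is not null")
remaining = conn.execute("select count(*) from cities_history").fetchone()[0]
print("rows after cleanup:", remaining)
```

Two updates leave three stored versions of a single city, which mirrors the growth described above; after the cleanup step only the current version remains.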