Database Replication with Slony-I
April 28th, 2005 by Ludovic Marcotte in
Database management systems have been a crucial component of infrastructures for many years now. PostgreSQL is an advanced, object-relational database management system that is frequently used to provide such services. Although this database management system has proven to be stable for many years, the two available open-source replication solutions, rserv and ERServer, had serious limitations and needed replacement.
Fortunately, such a replacement recently became available. Slony-I is a trigger-based master to multiple slaves replication system for PostgreSQL being developed by Jan Wieck. This enterprise-level replication solution works asynchronously and offers all key features required by data centers. Among the key Slony-I usage scenarios are:
Database replication from the head office to various branches to reduce bandwidth usage or speed up database requests.
Database replication to offer load balancing in all instances. This can be particularly useful for report generators or dynamic Web sites.
Database replication to offer high availability of database services.
Hot backup using a standby server or upgrades to a new release of PostgreSQL.
This article walks you through the steps required to install Slony-I and replicate a simple database located on the same machine. It also describes how Slony-I can be combined with high-availability solutions to provide automatic failover.
To install Slony-I and replicate a simple database, first install PostgreSQL from source. Slony-I supports PostgreSQL 7.3.2 or higher; 7.4.x and 8.0 need the location of the PostgreSQL source tree when being compiled. If you prefer using PostgreSQL packages from your favorite distribution, simply rebuild them from the package sources and keep the package build location intact so it can be used when compiling Slony-I. That said, obtain the latest Slony-I release, which is 1.0.5, compile and install it. To do so, proceed with the following commands:
% tar -zxvf slony1-1.0.5.tar.gz % cd slony1-1.0.5 % ./configure \ --with-pgsourcetree=/usr/src/redhat/BUILD/postgresql-7.4.5 % make install
In this example, we tell the Slony-I's configure script to look in /usr/src/redhat/BUILD/postgresql-7.4.5/ for the location of the PostgreSQL sources, the directory used when building the PostgreSQL 7.4.5 RPMs on Red Hat Enterprise Linux. The last command compiles Slony-I and installs the following files:
$postgresql_bindir/slonik: the administration and configuration script utility of Slony-I. slonik is a simple tool, usually embedded in shell scripts, used to modify Slony-I replication systems. It supports its own format-free command language described in detail in the Slonik Command Summary document.
$postgresql_bindir/slon: the main replication engine. This multithreaded engine makes use of information from the replication schema to communicate with other engines, creating the distributed replication system.
$postgresql_libdir/slony1_funcs.so: the C functions and triggers.
$postgresql_libdir/xxid.so: additional datatype to store transaction IDs safely.
$postgresql_datadir/slony1_base.sql: replication schema.
$postgresql_datadir/slony1_base.v73.sql.
$postgresql_datadir/slony1_base.v74.sql.
$postgresql_datadir/slony1_funcs.sql: replication functions.
$postgresql_datadir/slony1_funcs.v73.sql.
$postgresql_datadir/slony1_funcs.v74.sql.
$postgresql_datadir/xxid.v73.sql: a script used to load the additional datatype previously defined.
Generally, $postgresql_bindir points to /usr/bin/, $postgresql_libdir to /usr/lib/pgsql/ and $postgresql_datadir to /usr/share/pgsql/. Use the pg_config --configure command to display the parameters used when PostgreSQL was built to find the various locations for your own installation. Those files are all that is needed to offer a complete replication engine for PostgreSQL.
As you can see in Figure 1, Slony-I's main replication engine, slon, makes use of many threads. The synchronization thread verifies at a configurable interval if there has been replicable database activity, generating SYNC events if such activity happens. The local listen thread listens for new configuration events and modifies the cluster configuration and the in-memory configuration of the slon process accordingly.
As its name implies, the cleanup thread performs maintenance on the Slony-I schema, like removing old events or vacuuming the tables. The remote listen thread connects to the remote node's database to receive events from its event provider. When it receives events or confirmations, it selects the corresponding information and feeds the internal message queue of the remote workers thread. The replication data is combined into groups of transactions. The remote workers thread, one per remote node, does the actual data replication, events storing and generation of confirmations. At any moment, the slave knows exactly what groups of transactions it has consumed.
We first create the database we will replicate. This database contains a single table and sequence. Let's create a user contactuser, the contactdb database and activate the plpgsql programming language to this newly created PostgreSQL database by proceeding with the following commands:
% su - postgres % createuser --pwprompt contactuser Enter password for user "contactuser": (specify a password) Enter it again: Shall the new user be allowed to create databases? (y/ n) y Shall the new user be allowed to create more new users? (y/ n) n % createdb -O contactuser contactdb % createlang -U postgres -h localhost plpgsql \ contactdb
Then, we create the sequence and the table in the database we will replicate and insert some information in the table:
% psql -U contactuser contactdb
contactdb=> create sequence contact_seq start with 1;
contactdb=> create table contact (
cid int4 primary key,
name varchar(50),
address varchar(255),
phonenumber varchar(15)
);
contactdb=> insert into contact (cid, name, address,
phonenumber) values ((select nextval('contact_seq')),
'Joe', '1 Foo Street', '(592) 471-8271');
contactdb=> insert into contact (cid, name, address,
phonenumber) values ((select nextval('contact_seq')),
'Robert', '4 Bar Roard', '(515) 821-3831');
contactdb=> \q
For the sake of simplicity, let's create a second database on the same system in which we will replicate the information from the contactdb database. Proceed with the following commands to create the database, add plpgsql programming language support and import the schema without any data from the contactdb database:
% su - postgres % createdb -O contactuser contactdb_slave % createlang -U postgres -h localhost plpgsql \ contactdb_slave % pg_dump -s -U postgres -h localhost contactdb | \ psql -U postgres -h localhost contactdb_slave
Once the databases are created, we are ready to create our database cluster containing a master and a single slave. Create the Slonik cluster_setup.sh script and execute it. Listing 1 shows the content of the cluster_setup.sh script.
The first slonik command (cluster name) of Listing 1 defines the namespace where all Slony-I-specific functions, procedures, tables and sequences are defined. In Slony-I, a node is a collection of a database and a slon process, and a cluster is a collection of nodes, connected using paths between each other. Then, the connection information for node 1 and 2 is specified, and the first node is initialized (init cluster). Once completed, the script creates a new set to replicate, which is essentially a collection containing the public.contact table and the public.contact_seq sequence. After the creation of the set, the script adds the contact table to it and the contact_seq sequence. The store node command is used to initialize the second node (id = 2) and add it to the cluster (sql_cluster). Once completed, the scripts define how the replication system of node 2 connects to node 1 and how node 1 connects to node 2. Finally, the script tells both nodes to listen for events (store listen) for every other node in the system.
Once the script has been executed, start the slon replication processes. A slon process is needed on the master and slave nodes. For our example, we start the two required processes on the same system. The slon processes must always be running in order for the replication to take place. If for some reason they must be stopped, simply restarting allows them to continue where they left off. To start the replication engines, proceed with the following commands:
% slon sql_cluster "dbname=contactdb user=postgres" & % slon sql_cluster "dbname=contactdb_slave user=postgres" &
Next, we need to subscribe to the newly created set. Subscribing to the set causes the second node, the subscriber, to start replicating the information of the contact table and contact_seq sequence from the first node. Listing 2 shows the content of the subscription script.
Much like Listing 1, subscribe.sh starts by defining the cluster namespace and the connection information for the two nodes. Once completed, the subscribe set command causes the first node to start replicating the set containing a single table and sequence to the second node using the slon processes.
Once the subscribe.sh script has been executed, connect to the contactdb_slave database and examine the content of the contact table. At any moment, you should see that the information was replicated correctly:
% psql -U contactuser contactdb_slave contactdb_slave=> select * from contact; cid | name | address | phonenumber -----+--------+--------------+---------------- 1 | Joe | 1 Foo Street | (592) 471-8271 2 | Robert | 4 Bar Roard | (515) 821-3831
Now, connect to the /contactdb/ database and insert a row:
% psql -U contact contactdb
contactdb=> begin; insert into contact (cid, name,
address, phonenumber) values
((select nextval('contact_seq')), 'William',
'81 Zot Street', '(918) 817-6381'); commit;
If you examine the content of the contact table of the contactdb_slave database once more, you will notice that the row was replicated. Now, delete a row from the /contactdb/ database:
contactdb=> begin; delete from contact where cid = 2; commit;
Again, by examining the content of the contact table of the contactdb_slave database, you will notice that the row was removed from the slave node correctly.
Instead of comparing the information for contactdb and contactdb_slave manually, we easily can automate this process with a simple script, as shown in Listing 3. Such a script could be executed regularly to ensure that all nodes are in sync, notifying the administrator if that is no longer the case.
Although replicating a database on the same system isn't of much use, this example shows how easy it is to do. If you want to experiment with a replication system on nodes located on separate computers, you simply would modify the DB2, H1 and H2 environment variables from Listing 1 to 3. Normally, DB2 would be set to the same value as DB1, so an application always refers to the same database name. The host environment variables would need to be set to the fully qualified domain name of the two nodes. You also would need to make sure that the slon processes are running on both computers. Finally, it is good practice to synchronize the clocks of all nodes using ntpd or something similar.
Later, if you want to add more tables or sequences to the initial replication set, you can create a new set and use the merge set slonik command. Alternatively, you can use the set move table and set move sequence commands to split the set. Refer to the Slonik Command summary for more information on this.
In case of a failure from the master node, due to an operating system crash or hardware problem, for example, Slony-I does not provide any automatic capability to promote a slave node to become a master. This is problematic because human intervention is required to promote a node, and applications demanding highly available database services should not depend on this. Luckily, plenty of solutions are available that can be combined with Slony-I to offer automatic failover capabilities. The Linux-HA Heartbeat program is one of them.
Consider Figure 2, which shows a master and slave node connected together using an Ethernet and serial link. In this configuration, the Heartbeat is used to monitor the node's availability through those two links. The application makes use of the database services by connecting to PostgreSQL through an IP alias, which is activated on the master node by the Heartbeat. If the Heartbeat detects that the master node has failed, it brings the IP alias up on the slave node and executes the slonik script to promote the slave as the new master.
The script is relatively simple. Listing 4 shows the content of the script that would be used to promote a slave node, running on slave.example.com, so it starts offering all the database services that master.example.com offered.
From Listing 4, the failover Slonik command is used to indicate that the node with id = 1, the node running on master.example.com, has failed, and that the node with id = 2 will take over all sets from the failed node. The second command, drop node, is used to remove the node with id = 1 from the replication system completely. Eventually, you might want to bring back the failed node in the cluster. To do this, you must configure it as a slave and let Slony-I replicate any missing information. Eventually, you can proceed with a switchback to the initial master node by locking the set (lock set), waiting for all events to complete (wait for event), moving the set to a new origin (move set) and waiting for a confirmation that the last command has completed. Please refer to the Slonik Command Summary for more information on those commands.
Replicating databases using Slony-I is relatively simple. Combined with the Linux-HA Heartbeat, this allows you to offer high availability of your database services. Although the combination of Slony-I and Linux HA-Heartbeat is an attractive solution, it is important to note that this is not a substitute for good hardware for your database servers.
Even with its small limitations, like not being able to propagate schema changes or replicate large objects, Slony-I is a great alternative to both rserv and ERServer and is now, in fact, the preferred solution for replicating PostgreSQL databases. Slony-II even supports synchronous multimaster replication and is already on the design table.
To conclude, I would like to thank Jan Wieck, the author of Slony-I, for reviewing this article.
Resources for this article: /article/8202.
Special Magazine Offer -- 2 Free Trial Issues!
Receive 2 free trial issues of Linux Journal as well as instant online access to current and past issues. There's NO RISK and NO OBLIGATION to buy. CLICK HERE for offer
Linux Journal: delivering readers the advice and inspiration they need to get the most out of their Linux systems since 1994.
Sorry, offer available in the US only. International orders, click here.
Subscribe now!
Recently Popular
| Linux HOWTO: Video Editing Magic with ffmpeg | Jul-23-08 |
| Man vs. Myth: Greg Kroah-Hartman and the Kernel Driver Project | Jul-21-08 |
| Google Gadgets for Linux | Jul-21-08 |
| Building a Call Center with LTSP and Soft Phones | Aug-25-05 |
| Review: HP 2133 Mini-Note | Jul-16-08 |
| Boot with GRUB | May-01-01 |
Featured Videos
Non-linear video editing tools are great, but they're not always the best tool for the job. This is where a powerful tool like ffmpeg becomes useful. This tutorial by Elliot Isaacson covers the basics of transcoding video, as well as more advanced tricks like creating animations, screen captures, and slow motion effects.
Shawn Powers reviews the HP Mini-Note portable computer.
Thanks to our sponsor: Silicon Mechanics
Silicon Mechanics is a leading manufacturer of rackmount servers, storage, and high performance computing hardware. The best warranty offerings available are backed by experts dedicated to customer satisfaction.
From the Magazine
August 2008, #172
There's nuttin like a Cool Project to give you some relief from the summer heat, so get out your parka cuz we got a bunch of em. First up is the BUG, not a bug, The BUG. It's got a GPS, camera and more, in a hand-sized package that's user programmable. The BUG does everything. It's both a floor wax and a dessert topping. Get one now. Need a software version of a Swiss Army knife? Take a look at Billix, and don't leave home without it. Then, chew on this one, an X server on a Gumstix device driving an E-Ink display. Need more storage? How about 16 Terabytes? Can do.
And, of course, we have the usual cast of characters: Marcel, Reuven, Dave, Kyle, Doc, plus the new kid on the block Shawn Powers. But it doesn't stop there: build a MythTV box on a budget, build your own GIS system, set up the tools to monitor your enterprise and more. Finally, remember The War of the Worlds? Now you can play too.


Delicious
Digg
Reddit
Newsvine
Technorati







Problems with Failing Over
On November 21st, 2007 Carlos Hernán (not verified) says:
Hi, i have problems with Failing Over script in heart beat, someone can tell me how implement promote.sh in heartbeat.
Thanks...
Excellent Article
On October 29th, 2007 Anonymous (not verified) says:
Good job at breaking things down and keeping it simple. This review of Slony-I should be link of their main website for people looking for a quick yet structured walk-through.
Keep up the good work ;)
SERVERS ON DIFFERENT MACHINE
On May 30th, 2007 dilsha (not verified) says:
HI....
nice tutorial
In my case i have both master and slave on different machine so is it necessary to execute the script files and to start the slon engine on both the systems?
plz help me
Thanks
In case of two database server
On April 23rd, 2007 Mahendra (not verified) says:
In case of two database server. Do I need to run both cluster_setup.sh and subscribe.sh tobe executed on both database?
Also artical specifies that in node defination host=FQDN. Can we give IP address in this case?
Thanks
Hi guys!
On March 29th, 2007 Giacomo Orizzonte (not verified) says:
exelent tutorial Ludovic!
i try follow all steps, and i thnik all is ok but dont see in my replicating base any rows from master, i see two process slon (well is two diferent pc's) i run cluster.sh and suscribe.sh scripts in two pc's, whit out errors. and i see the mensajes for slon
( i open in console see this --------------------
2007-03-29 12:02:18 CDT CONFIG storeListen: li_origin=2 li_receiver=1 li_provider=2
2007-03-29 12:07:01 CDT DEBUG1 cleanupThread: 0.001 seconds for cleanupEvent()
2007-03-29 12:07:01 CDT DEBUG1 cleanupThread: 0.004 seconds for delete logs
-----------------------------
)
but dont start the sync process.
do you have any idea?
thanks very much.
thanx
On February 8th, 2007 Somu (not verified) says:
The article is well descriptive, and helped a lot for me to setup the secondary server for postgres
THANX A LOT
Do i Have to install SLONY in the Master and in the Slave server
On January 15th, 2007 Icebergdelphi (not verified) says:
Hi, i am from Chiapas Mexico, actually i have a Postgre Database installed in a CENTOS Linux Distro(In my Principal Office), but i need to stablish a 7 Servers with the same database and with CENTOS in every Branch of my company, my big question is: Do i have to Install SLONY in every Server in every branch of my company with the same configuration of the Master database?
I readed Database Replication with Slony-I, but i never saw something about the slaves configurations.
Tanx
and sorry 4 My English
Hiber Tadeo Moreno (IcebergDelphi)
timestamp problem
On June 27th, 2006 umar (not verified) says:
I've tried
cluster_setup.shabove and get this error:ERROR: invalid input syntax for type timestamp: "Tue Jun 27 14:06:04.735468 2006 WIT"I've tried suggestion from http://gborg.postgresql.org/project/slony1/bugs/bugupdate.php?1300 but still didn't work. Any help will be appreciated.
Notes:
I change the U variable from
postgresto my account with superuser privilege and add P variable for password. The default postgres user keep being asked for password. I'm using postgres and slony1 from debian package.reply timestamp problem
On July 13th, 2006 irawan (not verified) says:
find this line in postgresql.conf and change
timezone=UTC
or
timezone=GMT
Automated failover Considered Harmful
On August 5th, 2005 Andrew Hammond (not verified) says:
The decision to failover (including accepting the potential loss of committed transactions) is not usually one you want software making for you. Typically in the event of a failure, it would make more sense to attempt to get the failed master up and stumbling along enough to perform a move set (the controlled, non-lossy alternative to failover). Failover is a last resort with async replication systems.
Also, it's worth at least mentioning the _excellent_ slony mailing list:
http://gborg.postgresql.org/mailman/listinfo/slony1
I followed the exact same ste
On August 4th, 2005 Steven (not verified) says:
I followed the exact same steps but when I was running 'subscribe.sh' I got the error message below:
:4: NOTICE: subscribeSet:: set 1 has no tables - risk of problems - see bug 1226
:4: NOTICE: http://gborg.postgresql.org/project/slony1/bugs/bugupdate.php?1226
Please help me solve this. Thanks.
It's normal. it's a NOTICE me
On August 17th, 2005 Anonymous (not verified) says:
It's normal. it's a NOTICE message, not ERROR message. Everything work like a charm (like async charm ;D)
U rock!!
On December 14th, 2006 dhenz (not verified) says:
I followed your stuff and I got it to work just one time!! u rock thanx!!
idle process
On June 12th, 2007 toone (not verified) says:
We are running slony to check for changes every 10 seconds but we observe that process are overlapping and tends to become idle thus using up all of 4GB of memory what can we do?