Infrastructure at your Service

Daniel Westermann

Vertically scale your PostgreSQL infrastructure with pgpool – 2 – Automatic failover and reconfiguration

In the first post of this little series we did the basic setup of pgpool and configured the watchdog. As of now, pgpool uses a VIP which is active on one of the pgpool nodes and fails over to the other node in case the node where the VIP currently runs goes down for any reason. This provides the very same address for applications or clients to connect to and eliminates the single point of failure you have when only one pgpool instance is running. What is still missing is a mechanism to automatically fail over to a promoted standby instance in case the PostgreSQL master goes down. We even want to be able to reconfigure the old master as a new standby that follows the new master automatically. This is what this post is about …

Let's take a look at the picture again:

(Figure: pgpool architecture)

What happens when we stop the standby instance? Currently, both instances are in status “up” when you ask pgpool (notice that I connect to 192.168.22.38, which is the VIP managed by pgpool):

[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | up     | 0.500000  | primary | 0          | true              | 0
 1       | 192.168.22.35 | 5432 | up     | 0.500000  | standby | 0          | false             | 0
(2 rows)

Let's create a new connection first to see what happens to existing connections:

[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 postgres
psql (9.6.2 dbi services build)
Type "help" for help.

postgres=# select now();
              now              
-------------------------------
 2017-03-24 10:53:16.077351+01
(1 row)


postgres=#

Shut down the standby instance:

postgres@pgpool2:/home/postgres/ [PG1] psql -c "select pg_is_in_recovery()" postgres
 pg_is_in_recovery 
-------------------
 t
(1 row)

postgres@pgpool2:/home/postgres/ [PG1] pg_ctl -D /u02/pgdata/PG1/ stop -m fast
waiting for server to shut down.... done
server stopped

What happened to our connection?

postgres=# select now();
              now              
-------------------------------
 2017-03-24 10:54:46.802984+01
(1 row)

Nothing, we can just keep using it. What is pgpool telling us about our instances?

[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | up     | 0.500000  | primary | 4          | true              | 0
 1       | 192.168.22.35 | 5432 | down   | 0.500000  | standby | 0          | false             | 0
(2 rows)

As expected, the standby node is now down. This is the easy case: just start the standby again and you are fine:

postgres@pgpool2:/home/postgres/ [PG1] pg_ctl -D /u02/pgdata/PG1/ start
server starting
postgres@pgpool2:/home/postgres/ [PG1] psql -c "select pg_is_in_recovery()" postgres
 pg_is_in_recovery 
-------------------
 t
(1 row)

… and pgpool should detect this as well:

[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | up     | 0.500000  | primary | 5          | true              | 0
 1       | 192.168.22.35 | 5432 | down   | 0.500000  | standby | 0          | false             | 0
(2 rows)

No, it does not. What we need to do is attach the node again:

[postgres@centos7_pgpool_m1 ~]$ pcp_attach_node -n 1
Password: 
pcp_attach_node -- Command Successful
[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | up     | 0.500000  | primary | 5          | true              | 0
 1       | 192.168.22.35 | 5432 | up     | 0.500000  | standby | 0          | false             | 0

Now comes the not-so-easy part: what happens if the primary instance goes down? This is where the failover_command specified in pgpool.conf comes into play:

[postgres@centos7_pgpool_m1 etc]$ cat pgpool.conf | grep failover_command
failover_command = '/home/postgres/failover.sh %d "%h" %p %D %m %M "%H" %P'

When pgpool detects that the master is gone, the failover_command is executed. This is a very simple failover script (located on the pgpool hosts); the parameters are passed into the script by pgpool automatically:

#!/bin/sh -x
# Execute command by failover.
# special values:  %d = node id
#                  %h = host name
#                  %p = port number
#                  %D = database cluster path
#                  %m = new master node id
#                  %M = old master node id
#                  %H = new master node host name
#                  %P = old primary node id
#                  %% = '%' character
failed_node_id=$1
failed_host_name=$2
failed_port=$3
failed_db_cluster=$4
new_master_id=$5
old_master_id=$6
new_master_host_name=$7
old_primary_node_id=$8

if [ "$failed_node_id" = "$old_primary_node_id" ]; then    # master failed
    ssh -T postgres@${new_master_host_name} /home/postgres/promote.sh
fi
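The only decision the script makes is whether the failed node was the primary; only then is a promotion triggered. That logic can be exercised locally, without a cluster or ssh access. This is just a sketch mirroring the condition above; the function name `should_promote` is mine, not pgpool's:

```shell
#!/bin/sh
# Mirrors the decision in failover.sh: promote the new master only if
# the node that failed was the old primary; a failed standby is a no-op.
should_promote() {
    failed_node_id=$1
    old_primary_node_id=$2
    if [ "$failed_node_id" = "$old_primary_node_id" ]; then
        echo "promote"
    else
        echo "noop"
    fi
}
should_promote 0 0   # primary (node 0) failed -> promote
should_promote 1 0   # standby (node 1) failed -> noop
```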

The promote.sh script is very simple as well and must be available on all PostgreSQL nodes:

#!/bin/bash
PGDATA="/u02/pgdata/PG1"
PATH="/u01/app/postgres/product/96/db_2/bin/:$PATH"
export PATH PGDATA
pg_ctl promote -D ${PGDATA} >> /var/tmp/failover.log
psql -c "select * from pg_create_physical_replication_slot('standby1')" postgres >> /var/tmp/failover.log
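After promote.sh has run you can verify the result with "select pg_is_in_recovery()": 't' means the instance is still a standby, 'f' means the promotion went through. A small sketch of that interpretation (the helper `role_from_recovery_flag` is hypothetical, not part of the setup above):

```shell
#!/bin/sh
# Map the output of "select pg_is_in_recovery()" to a role:
# 't' = still in recovery (standby), 'f' = promoted (primary).
role_from_recovery_flag() {
    case "$1" in
        t) echo "standby" ;;
        f) echo "primary" ;;
        *) echo "unknown" ;;
    esac
}
# On a real node you would feed it the query result, e.g.:
#   role_from_recovery_flag "$(psql -At -c 'select pg_is_in_recovery()' postgres)"
role_from_recovery_flag t
role_from_recovery_flag f
```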

So once we shut down the master this gets triggered and the standby is promoted. How can we re-attach the old master as a new standby? The magic is in here:

[postgres@centos7_pgpool_m1 ~]$ grep recovery_1st_stage_command /u01/app/postgres/product/pgpool-II/etc/pgpool.conf
recovery_1st_stage_command = 'recover.sh'

In the pgpool.conf file a script is specified which is called when a recovery is triggered. This script needs to be in PGDATA, otherwise it will not work. It looks like this (on both nodes, of course):

postgres@pgpool1:/u02/pgdata/PG1/ [PG1] cat $PGDATA/recover.sh 
#!/bin/sh
set -x
master_node_host_name=`hostname`
master_db_cluster=$1
recovery_node_hostname=$2
recovery_db_cluster=$3
tmp=/tmp/mytemp$$
trap "rm -f $tmp" 0 1 2 3 15

PGDATA="/u02/pgdata/PG1"
PATH="/u01/app/postgres/product/96/db_2/bin/:$PATH"

export PATH PGDATA

# make sure the old master is down
ssh -T $recovery_node_hostname /home/postgres/stop_instance.sh
# rewind the old master
ssh -T ${recovery_node_hostname} /home/postgres/rewind.sh ${master_node_host_name}

cat > $tmp <<EOF
standby_mode = 'on'
primary_conninfo = 'host=$master_node_host_name user=postgres'
primary_slot_name = 'standby1'
recovery_target_timeline = 'latest'
trigger_file='/u02/pgdata/PG1/failover_trigger'
EOF

scp $tmp $recovery_node_hostname:$recovery_db_cluster/recovery.conf

ssh -T ${recovery_node_hostname} /home/postgres/start_instance.sh

The “set -x” at the top is quite important because this script is executed from inside PostgreSQL and you'll see its output in the PostgreSQL logfile. The scripts called from recover.sh are:

postgres@pgpool1:/home/postgres/ [PG1] cat stop_instance.sh 
#!/bin/bash
PGDATA="/u02/pgdata/PG1"
PATH="/u01/app/postgres/product/96/db_2/bin/:$PATH"
pg_ctl -D $PGDATA stop -m immediate 

postgres@pgpool1:/home/postgres/ [PG1] cat rewind.sh 
#!/bin/bash
master_node_host_name=$1
PGDATA="/u02/pgdata/PG1"
PATH="/u01/app/postgres/product/96/db_2/bin/:$PATH"
pg_rewind --target-pgdata=${PGDATA} --source-server="port=5432 user=postgres dbname=postgres host=${master_node_host_name}" >> /var/tmp/recover.log
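Note that pg_rewind only works when the target cluster has wal_log_hints=on or was initialized with data checksums; otherwise rewind.sh will fail and the old master cannot be re-synchronized this way. A quick sketch of that precondition (the helper `rewind_possible` is mine; on a real node you would feed it the values of "show wal_log_hints" and "show data_checksums"):

```shell
#!/bin/sh
# pg_rewind requires wal_log_hints=on OR data checksums enabled
# on the target cluster; either one is sufficient.
rewind_possible() {
    wal_log_hints=$1
    data_checksums=$2
    if [ "$wal_log_hints" = "on" ] || [ "$data_checksums" = "on" ]; then
        echo "yes"
    else
        echo "no"
    fi
}
rewind_possible on off    # prints "yes"
rewind_possible off off   # prints "no"
```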

postgres@pgpool1:/home/postgres/ [PG1] cat start_instance.sh 
#!/bin/bash
PGDATA="/u02/pgdata/PG1"
PATH="/u01/app/postgres/product/96/db_2/bin/:$PATH"
pg_ctl -w -s -D $PGDATA start 2>/dev/null 1>/dev/null < /dev/null &

Having this in place, we can rebuild the old master:

[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | down   | 0.500000  | standby | 0          | false             | 0
 1       | 192.168.22.35 | 5432 | up     | 0.500000  | primary | 0          | true              | 0

[postgres@centos7_pgpool_m1 ~]$ pcp_recovery_node -n 0 -w
pcp_recovery_node -- Command Successful

[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | up     | 0.500000  | standby | 0          | false             | 0
 1       | 192.168.22.35 | 5432 | up     | 0.500000  | primary | 0          | true              | 0

The output in the PostgreSQL logfile of the master then looks similar to this:

++ hostname
+ master_node_host_name=pgpool2
+ master_db_cluster=/u02/pgdata/PG1
+ recovery_node_hostname=192.168.22.34
+ recovery_db_cluster=/u02/pgdata/PG1
+ tmp=/tmp/mytemp10901
+ trap 'rm -f /tmp/mytemp10901' 0 1 2 3 15
+ PGDATA=/u02/pgdata/PG1
+ PATH=/u01/app/postgres/product/96/db_2/bin/:/u01/app/postgres/product/96/db_2bin:/u01/app/postgres/product/96/db_2/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/postgres/.local/bin:/home/postgres/bin:/sbin:/usr/edb-bart/bin:/usr/edb-efm/bin:/u01/app/postgres/product/pgpool-II/bin:/u01/app/postgres/local/dmk/bin
+ export PATH PGDATA
+ ssh -T 192.168.22.34 /home/postgres/stop_instance.sh
pg_ctl: PID file "/u02/pgdata/PG1/postmaster.pid" does not exist
Is server running?
+ ssh -T 192.168.22.34 /home/postgres/rewind.sh pgpool2
+ cat
+ scp /tmp/mytemp10901 pgpool1:/u02/pgdata/PG1/recovery.conf
+ ssh 192.168.22.34 /home/postgres/start_instance.sh
+ psql -c 'select '\''done'\''' postgres
 ?column? 
----------
 done
(1 row)

Time: 0.140 ms
+ rm -f /tmp/mytemp10901

I have to admit that I needed a small hack: at the end of the recovery pgpool tries to execute $PGDATA/pgpool_remote_start and I do not want to do it that way, so I just created this file on both nodes:

postgres@pgpool1:/u02/pgdata/PG1/ [PG1] cat $PGDATA/pgpool_remote_start
exit 0
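Since pgpool invokes this stub over ssh, it should exist and be executable on every PostgreSQL node, or pcp_recovery_node will fail at the remote-start step. A sketch for creating it; the fallback to a temporary directory is only there so the snippet is safe to try anywhere, on a real node you would have PGDATA pointing at the cluster (e.g. /u02/pgdata/PG1):

```shell
#!/bin/sh
# Create the pgpool_remote_start stub and make it executable.
# Assumption: PGDATA points at the cluster directory on a real node;
# here we fall back to a temp dir so the snippet can run anywhere.
PGDATA=${PGDATA:-$(mktemp -d)}
printf '#!/bin/sh\nexit 0\n' > "$PGDATA/pgpool_remote_start"
chmod +x "$PGDATA/pgpool_remote_start"
ls -l "$PGDATA/pgpool_remote_start"
```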

In the next post we'll add a second standby system and then teach pgpool how a remaining standby can be told to follow a new master automatically. We'll then need to adjust the scripts here to be more dynamic as well.


Daniel Westermann, Senior Consultant and Technology Leader Open Infrastructure