When I attended PGIBZ 2019 earlier this year, I talked with Dimitri about pg_auto_failover and I promised to have a look at it. Well, almost half a year later and after we’ve met again at pgconf.eu it is time to actually do that. You probably already know that citudata was acquired by Microsoft earlier this year and that Microsoft seems to be committed to open source since a few years. pg_auto_failover is one of the projects they contribute back to the PostgreSQL community. This will be a multi-blog series and in this very first post it is all about getting it up and running. In a following post we will then look at failover and switchover scenarios.
As usual, when you need auto failover you need at least three nodes and pg_auto_failover is no exception to that. The following graphic is stolen from the pg_auto_failover github page:
We have one PostgreSQL master, one PostgreSQL replica and in addition a monitoring host. In may case that maps to:
pg-af1.ti.dbi-services.com | master | 192.168.22.70 | pg-af2.ti.dbi-services.com | replica | 192.168.22.71 | pg-af3.ti.dbi-services.com | monitor/cluster management | 192.168.22.72 |
All of these nodes run CentOS 8 and I will be going from source code as that gives most flexibility. As pg_auto_failover depends on PostgreSQL (of course) the first step is to install PostgreSQL on all three nodes (PostgreSQL 12 in this setup). If you need further information on how to do that you can e.g. check here. Basically these steps have been executed on all the three nodes (given that the postgres user already exists and sudo is configured):
[postgres@pg-af1 ~]$ sudo dnf install -y gcc openldap-devel python36-devel readline-devel redhat-lsb bison flex perl-ExtUtils-Embed zlib-devel openssl-devel pam-devel libxml2-devel libxslt-devel openssh-clients bzip2 net-tools wget unzip sysstat xorg-x11-xauth systemd-devel bash-completion python36 policycoreutils-python-utils make git [postgres@pg-af1 ~]$ wget https://ftp.postgresql.org/pub/source/v12.0/postgresql-12.0.tar.bz2 [postgres@pg-af1 ~]$ tar -axf postgresql-12.0.tar.bz2 [postgres@pg-af1 ~]$ cd postgresql-12.0 [postgres@pg-af1 postgresql-12.0]$ sudo mkdir -p /u01 /u02 [postgres@pg-af1 postgresql-12.0]$ sudo chown postgres:postgres /u01 /u02 [postgres@pg-af1 postgresql-12.0]$ PGHOME=/u01/app/postgres/product/12/db_0/ [postgres@pg-af1 postgresql-12.0]$ SEGSIZE=2 [postgres@pg-af1 postgresql-12.0]$ BLOCKSIZE=8 [postgres@pg-af1 postgresql-12.0]$ WALSEGSIZE=64 [postgres@pg-af1 postgresql-12.0]$ ./configure --prefix=${PGHOME} \ > --exec-prefix=${PGHOME} \ > --bindir=${PGHOME}/bin \ > --libdir=${PGHOME}/lib \ > --sysconfdir=${PGHOME}/etc \ > --includedir=${PGHOME}/include \ > --datarootdir=${PGHOME}/share \ > --datadir=${PGHOME}/share \ > --with-pgport=5432 \ > --with-perl \ > --with-python \ > --with-openssl \ > --with-pam \ > --with-ldap \ > --with-libxml \ > --with-libxslt \ > --with-segsize=${SEGSIZE} \ > --with-blocksize=${BLOCKSIZE} \ > --with-systemd \ > --with-extra-version=" dbi services build" [postgres@pg-af1 postgresql-12.0]$ make all [postgres@pg-af1 postgresql-12.0]$ make install [postgres@pg-af1 postgresql-12.0]$ cd contrib [postgres@pg-af1 contrib]$ make install [postgres@pg-af1 contrib]$ cd ../.. [postgres@pg-af1 ~]$ rm -rf postgresql*
We will go for an installation from source code of pg_auto_failover as well (again, on all three nodes):
postgres@pg-af1:/home/postgres/ [pg120] git clone https://github.com/citusdata/pg_auto_failover.git postgres@pg-af1:/home/postgres/ [pg120] cd pg_auto_failover/ postgres@pg-af1:/home/postgres/pg_auto_failover/ [pg120] make postgres@pg-af1:/home/postgres/pg_auto_failover/ [pg120] make install postgres@pg-af1:/home/postgres/pg_auto_failover/ [pg120] cd .. postgres@pg-af1:/home/postgres/ [pg120] rm -rf pg_auto_failover/
That’s it, quite easy. What I like especially is, that there are no dependencies on python or any other libraries except for PostgreSQL. What the installation gives us is basically pg_autoctl:
postgres@pg-af1:/home/postgres/ [pg120] pg_autoctl --help pg_autoctl: pg_auto_failover control tools and service usage: pg_autoctl [ --verbose --quiet ] Available commands: pg_autoctl + create Create a pg_auto_failover node, or formation + drop Drop a pg_auto_failover node, or formation + config Manages the pg_autoctl configuration + show Show pg_auto_failover information + enable Enable a feature on a formation + disable Disable a feature on a formation run Run the pg_autoctl service (monitor or keeper) stop signal the pg_autoctl service for it to stop reload signal the pg_autoctl for it to reload its configuration help print help message version print pg_autoctl version
The first step in setting up the cluster is to initialize the monitoring node:
postgres@pg-af3:/home/postgres/ [pg120] pg_autoctl create --help pg_autoctl create: Create a pg_auto_failover node, or formation Available commands: pg_autoctl create monitor Initialize a pg_auto_failover monitor node postgres Initialize a pg_auto_failover standalone postgres node formation Create a new formation on the pg_auto_failover monitor postgres@pg-af3:/home/postgres/ [pg120] sudo mkdir -p /u02/pgdata postgres@pg-af3:/home/postgres/ [pg120] sudo chown postgres:postgres /u02/pgdata postgres@pg-af3:/home/postgres/ [pg120] unset PGDATABASE postgres@pg-af3:/home/postgres/ [] pg_autoctl create monitor --pgdata /u02/pgdata/PG12/af INFO Initialising a PostgreSQL cluster at "/u02/pgdata/PG12/af" INFO /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/PG12/af --options "-p 5432" --options "-h *" --waitstart INFO Granting connection privileges on 192.168.22.0/24 INFO Your pg_auto_failover monitor instance is now ready on port 5432. INFO pg_auto_failover monitor is ready at postgres://autoctl_node@pg-af3:5432/pg_auto_failover INFO Monitor has been succesfully initialized.
Once that succeeds you’ll a new PostgreSQL instance running and pg_auto_failover PostgreSQL background worker processes:
postgres@pg-af3:/home/postgres/ [af] ps -ef | grep "postgres:" postgres 5958 5955 0 14:15 ? 00:00:00 postgres: checkpointer postgres 5959 5955 0 14:15 ? 00:00:00 postgres: background writer postgres 5960 5955 0 14:15 ? 00:00:00 postgres: walwriter postgres 5961 5955 0 14:15 ? 00:00:00 postgres: autovacuum launcher postgres 5962 5955 0 14:15 ? 00:00:00 postgres: stats collector postgres 5963 5955 0 14:15 ? 00:00:00 postgres: pg_auto_failover monitor postgres 5964 5955 0 14:15 ? 00:00:00 postgres: logical replication launcher postgres 5965 5955 0 14:15 ? 00:00:00 postgres: pg_auto_failover monitor worker postgres 5966 5955 0 14:15 ? 00:00:00 postgres: pg_auto_failover monitor worker
The initialization of the monitor node also created a new database and two roles:
postgres@pg-af3:/home/postgres/ [af] psql postgres psql (12.0 dbi services build) Type "help" for help. postgres=# \l List of databases Name | Owner | Encoding | Collate | Ctype | Access privileges ------------------+----------+----------+-------------+-------------+----------------------- pg_auto_failover | autoctl | UTF8 | en_US.UTF-8 | en_US.UTF-8 | postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres + | | | | | postgres=CTc/postgres template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres + | | | | | postgres=CTc/postgres postgres=# \du List of roles Role name | Attributes | Member of --------------+------------------------------------------------------------+----------- autoctl | | {} autoctl_node | | {} postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
What we got in the new database is the pgautofailover extension:
pg_auto_failover=# \dx List of installed extensions Name | Version | Schema | Description ----------------+---------+------------+------------------------------ pgautofailover | 1.0 | public | pg_auto_failover plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language (2 rows)
For our management kit to work properly a few PostgreSQL parameters will be set:
pg_auto_failover=# alter system set log_truncate_on_rotation = 'on'; ALTER SYSTEM pg_auto_failover=# alter system set log_filename = 'postgresql-%a.log'; ALTER SYSTEM pg_auto_failover=# alter system set log_rotation_age = '1440'; ALTER SYSTEM pg_auto_failover=# alter system set log_line_prefix = '%m - %l - %p - %h - %u@%d - %x'; ALTER SYSTEM pg_auto_failover=# alter system set log_directory = 'pg_log'; ALTER SYSTEM pg_auto_failover=# alter system set log_min_messages = 'WARNING'; ALTER SYSTEM pg_auto_failover=# alter system set log_autovacuum_min_duration = '60s'; ALTER SYSTEM pg_auto_failover=# alter system set log_min_error_statement = 'NOTICE'; ALTER SYSTEM pg_auto_failover=# alter system set log_min_duration_statement = '30s'; ALTER SYSTEM pg_auto_failover=# alter system set log_checkpoints = 'on'; ALTER SYSTEM pg_auto_failover=# alter system set log_statement = 'ddl'; ALTER SYSTEM pg_auto_failover=# alter system set log_lock_waits = 'on'; ALTER SYSTEM pg_auto_failover=# alter system set log_temp_files = '0'; ALTER SYSTEM pg_auto_failover=# alter system set log_timezone = 'Europe/Zurich'; ALTER SYSTEM pg_auto_failover=# alter system set log_connections=on; ALTER SYSTEM pg_auto_failover=# alter system set log_disconnections=on; ALTER SYSTEM pg_auto_failover=# alter system set log_duration=on; ALTER SYSTEM pg_auto_failover=# select pg_reload_conf(); pg_reload_conf ---------------- t (1 row)
What we need for the other nodes is the connection string to the monitoring node:
postgres@pg-af3:/home/postgres/ [af] pg_autoctl show uri postgres://autoctl_node@pg-af3:5432/pg_auto_failover
Once we have that we can proceed with creating the master instance on the first host:
postgres@pg-af1:/home/postgres/ [pg120] unset PGDATABASE postgres@pg-af1:/home/postgres/ [] sudo mkdir /u02/pgdata postgres@pg-af1:/home/postgres/ [] sudo chown postgres:postgres /u02/pgdata postgres@pg-af1:/home/postgres/ [] pg_autoctl create postgres --pgdata /u02/pgdata/12/PG1 --nodename pg-af1.it.dbi-services.com --monitor postgres://autoctl_node@pg-af3:5432/pg_auto_failover INFO Found pg_ctl for PostgreSQL 12.0 at /u01/app/postgres/product/12/db_0/bin/pg_ctl INFO Registered node pg-af1.it.dbi-services.com:5432 with id 1 in formation "default", group 0. INFO Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/u02/pgdata/12/PG1/pg_autoctl.init" INFO Successfully registered as "single" to the monitor. INFO Initialising a PostgreSQL cluster at "/u02/pgdata/12/PG1" INFO Postgres is not running, starting postgres INFO /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/12/PG1 --options "-p 5432" --options "-h *" --wait start INFO The user "postgres" already exists, skipping. INFO CREATE DATABASE postgres; INFO The database "postgres" already exists, skipping. INFO FSM transition from "init" to "single": Start as a single node INFO Initialising postgres as a primary INFO Transition complete: current state is now "single" INFO Keeper has been succesfully initialized.
Once the master if up bring up the replica on the second node:
postgres@pg-af2:/home/postgres/ [pg120] pg_autoctl create postgres --pgdata /u02/pgdata/12/PG1 --nodename pg-af2.it.dbi-services.com --monitor postgres://autoctl_node@pg-af3:5432/pg_auto_failover 17:11:42 INFO Registered node pg-af2.it.dbi-services.com:5432 with id 2 in formation "default", group 0. 17:11:42 INFO Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/u02/pgdata/12/PG1/pg_autoctl.init" 17:11:42 INFO Successfully registered as "wait_standby" to the monitor. 17:11:42 INFO FSM transition from "init" to "wait_standby": Start following a primary 17:11:42 INFO Transition complete: current state is now "wait_standby" 17:11:47 INFO FSM transition from "wait_standby" to "catchingup": The primary is now ready to accept a standby 17:11:47 INFO The primary node returned by the monitor is pg-af1.it.dbi-services.com:5432 17:11:47 INFO Initialising PostgreSQL as a hot standby 17:11:47 INFO Running /u01/app/postgres/product/12/db_0/bin/pg_basebackup -w -h pg-af1.it.dbi-services.com -p 5432 --pgdata /u02/pgdata/12/backup -U pgautofailover_replicator --write-recovery-conf --max-rate 100M --wal-method=stream --slot pgautofailover_standby ... 17:11:49 INFO pg_basebackup: initiating base backup, waiting for checkpoint to complete pg_basebackup: checkpoint completed pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1 pg_basebackup: starting background WAL receiver 0/23699 kB (0%), 0/1 tablespace (/u02/pgdata/12/backup/backup_label ) 136/23699 kB (0%), 0/1 tablespace (/u02/pgdata/12/backup/global/4184 ) 23708/23708 kB (100%), 0/1 tablespace (...data/12/backup/global/pg_control) 23708/23708 kB (100%), 1/1 tablespace pg_basebackup: write-ahead log end point: 0/2000100 pg_basebackup: waiting for background process to finish streaming ... pg_basebackup: syncing data to disk ... pg_basebackup: base backup completed 17:11:49 INFO Postgres is not running, starting postgres 17:11:49 INFO /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/12/PG1 --options "-p 5432" --options "-h *" --wait start 17:11:50 INFO PostgreSQL started on port 5432 17:11:50 INFO Transition complete: current state is now "catchingup" 17:11:50 INFO Keeper has been succesfully initialized.
The next step is to start the so called keeper process (this is the process which communicates with the montoring node about state changes):
postgres@pg-af1:/home/postgres/ [] pg_autoctl run --pgdata /u02/pgdata/12/PG1 INFO Managing PostgreSQL installation at "/u02/pgdata/12/PG1" INFO The version of extenstion "pgautofailover" is "1.0" on the monitor INFO pg_autoctl service is starting INFO Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0". INFO Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0". INFO Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0". INFO Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0".
To integrate that into systemd:
postgres@pg-af2:/home/postgres/ [PG1] pg_autoctl show systemd 20:28:43 INFO HINT: to complete a systemd integration, run the following commands: 20:28:43 INFO pg_autoctl -q show systemd --pgdata "/u02/pgdata/12/PG1" | sudo tee /etc/systemd/system/pgautofailover.service 20:28:43 INFO sudo systemctl daemon-reload 20:28:43 INFO sudo systemctl start pgautofailover [Unit] Description = pg_auto_failover [Service] WorkingDirectory = /u02/pgdata/12/PG1 Environment = 'PGDATA=/u02/pgdata/12/PG1' User = postgres ExecStart = /u01/app/postgres/product/12/db_0/bin/pg_autoctl run Restart = always StartLimitBurst = 0 [Install] WantedBy = multi-user.target postgres@pg-af2:/home/postgres/ [PG1] pg_autoctl -q show systemd --pgdata "/u02/pgdata/12/PG1" | sudo tee /etc/systemd/system/pgautofailover.service [Unit] Description = pg_auto_failover [Service] WorkingDirectory = /u02/pgdata/12/PG1 Environment = 'PGDATA=/u02/pgdata/12/PG1' User = postgres ExecStart = /u01/app/postgres/product/12/db_0/bin/pg_autoctl run Restart = always StartLimitBurst = 0 [Install] WantedBy = multi-user.target postgres@pg-af2:/home/postgres/ [PG1] systemctl list-unit-files | grep pgauto pgautofailover.service disabled 20:30:57 postgres@pg-af2:/home/postgres/ [PG1] sudo systemctl enable pgautofailover.service Created symlink /etc/systemd/system/multi-user.target.wants/pgautofailover.service → /etc/systemd/system/pgautofailover.service.
If you are on CentOS/Red Hat 8 you will also need this as otherwise the service will not start:
postgres@pg-af1:/u01/app/postgres/local/dmk/ [PG1] sudo semanage fcontext -a -t bin_t /u01/app/postgres/product/12/db_0/bin/pg_autoctl postgres@pg-af1:/u01/app/postgres/local/dmk/ [PG1] restorecon -v /u01/app/postgres/product/12/db_0/bin/pg_autoctl
After rebooting all the nodes (to confirm that the systemd service is working as expected) the state of the cluster reports one primary and a secondary/replica as expected:
postgres@pg-af3:/home/postgres/ [af] pg_autoctl show state Name | Port | Group | Node | Current State | Assigned State ---------------------------+--------+-------+-------+-------------------+------------------ pg-af1.it.dbi-services.com | 5432 | 0 | 1 | primary | primary pg-af2.it.dbi-services.com | 5432 | 0 | 2 | secondary | secondary
The various states are documented here.
Remember: As this is based on PostgreSQL 12 there will be no recovery.conf on the replica. The replication parameters have been added to postgresql.auto.conf automatically:
postgres@pg-af2:/u02/pgdata/12/PG1/ [PG1] cat postgresql.auto.conf # Do not edit this file manually! # It will be overwritten by the ALTER SYSTEM command. primary_conninfo = 'user=pgautofailover_replicator passfile=''/home/postgres/.pgpass'' connect_timeout=5 host=''pg-af1.it.dbi-services.com'' port=5432 sslmode=prefer sslcompression=0 gssencmode=disable target_session_attrs=any' primary_slot_name = 'pgautofailover_standby'
That’s it for the setup. Really easy and simple, I like it. In the next post we’ll have a look at controlled switch-overs and fail-over scenarios.