During the preparation of my presentation for pgconf.eu I ran into one big issue: I had to stop my cluster to add a new node. That was not the way I wanted to achieve this. I want a high availability solution that can be scaled up without any outage. Thanks to a little hint I got during pgconf.eu, I was able to find a solution. In this post I will show the manual scale-up, without using a playbook.

Starting position

We start with a three node Patroni cluster, which can be created using this blog post.
Now we want to add a fourth node to the existing etcd and Patroni cluster. In case you also need a playbook to install the fourth node, check out my GitHub repository.

Scale up the etcd cluster

This step is only needed if you want to scale up your etcd cluster as well. To scale up a Patroni cluster it is not necessary to scale up the etcd cluster: you can, of course, add Patroni nodes without adding more etcd members. But maybe someone also needs to scale up their etcd cluster and is searching for a solution. If not, just jump to the next step.

Make sure the etcd and Patroni services are not started on the fourth node.

postgres@patroni4:/home/postgres/ [PG1] systemctl status etcd
● etcd.service - dbi services etcd service
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: inactive (dead)
postgres@patroni4:/home/postgres/ [PG1] systemctl status patroni
● patroni.service - dbi services patroni service
   Loaded: loaded (/etc/systemd/system/patroni.service; enabled; vendor preset: disabled)
   Active: inactive (dead)
postgres@patroni4:/home/postgres/ [PG1]

Make the following adjustments in the etcd.conf of the fourth node. The decisive settings for joining an existing cluster are initial-cluster-state: 'existing' and an initial-cluster list that contains all four members.

postgres@patroni4:/home/postgres/ [PG1] cat /u01/app/postgres/local/dmk/etc/etcd.conf
name: patroni4
data-dir: /u02/pgdata/etcd
initial-advertise-peer-urls: http://192.168.22.114:2380
listen-peer-urls: http://192.168.22.114:2380
listen-client-urls: http://192.168.22.114:2379,http://localhost:2379
advertise-client-urls: http://192.168.22.114:2379
initial-cluster-state: 'existing'
initial-cluster: patroni1=http://192.168.22.111:2380,patroni2=http://192.168.22.112:2380,patroni3=http://192.168.22.113:2380,patroni4=http://192.168.22.114:2380

Next, add the new etcd member to the existing etcd cluster. You can execute this on any existing member of the cluster.

postgres@patroni1:/home/postgres/ [PG1] etcdctl member add patroni4 http://192.168.22.114:2380
Added member named patroni4 with ID dd9fab8349b3cfc to cluster

ETCD_NAME="patroni4"
ETCD_INITIAL_CLUSTER="patroni4=http://192.168.22.114:2380,patroni1=http://192.168.22.111:2380,patroni2=http://192.168.22.112:2380,patroni3=http://192.168.22.113:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"

The variables printed above correspond to the name, initial-cluster and initial-cluster-state settings we already put into etcd.conf, so nothing more needs to be adjusted. Now you can start the etcd service on the fourth node.

postgres@patroni4:/home/postgres/ [PG1] sudo systemctl start etcd
postgres@patroni4:/home/postgres/ [PG1] systemctl status etcd
● etcd.service - dbi services etcd service
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-10-17 16:39:16 CEST; 9s ago
 Main PID: 8239 (etcd)
   CGroup: /system.slice/etcd.service
           └─8239 /u01/app/postgres/local/dmk/bin/etcd --config-file /u01/app/postgres/local/dmk/etc/etcd.conf
postgres@patroni4:/home/postgres/ [PG1]

And after a short check we can see that node 4 has been added to the existing cluster.

postgres@patroni4:/home/postgres/ [PG1] etcdctl member list
dd9fab8349b3cfc: name=patroni4 peerURLs=http://192.168.22.114:2380 clientURLs=http://192.168.22.114:2379 isLeader=false
16e1dca5ee237693: name=patroni1 peerURLs=http://192.168.22.111:2380 clientURLs=http://192.168.22.111:2379 isLeader=false
28a43bb36c801ed4: name=patroni2 peerURLs=http://192.168.22.112:2380 clientURLs=http://192.168.22.112:2379 isLeader=false
5ba7b55764fad76e: name=patroni3 peerURLs=http://192.168.22.113:2380 clientURLs=http://192.168.22.113:2379 isLeader=true
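
If you want an additional sanity check beyond the member list, etcdctl (with the v2 API, which the output format above suggests is in use here) also offers a cluster-health command. In a healthy four node cluster, each member should be reported as healthy, followed by a "cluster is healthy" summary line:

postgres@patroni4:/home/postgres/ [PG1] etcdctl cluster-health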

Scale up Patroni

Scaling up the Patroni cluster is also really easy.
Adjust the hosts entry in the patroni.yml on the new node so that it lists all etcd members.

postgres@patroni4:/home/postgres/ [PG1] cat /u01/app/postgres/local/dmk/etc/patroni.yml | grep hosts
  hosts: 192.168.22.111:2379,192.168.22.112:2379,192.168.22.113:2379,192.168.22.114:2379
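
For context, this hosts line lives in the etcd section of the patroni.yml. A minimal sketch of that section, assuming it follows Patroni's standard configuration layout and the DMK paths shown above:

postgres@patroni4:/home/postgres/ [PG1] grep -A1 '^etcd' /u01/app/postgres/local/dmk/etc/patroni.yml
etcd:
  hosts: 192.168.22.111:2379,192.168.22.112:2379,192.168.22.113:2379,192.168.22.114:2379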

Afterwards, start the Patroni service.

postgres@patroni4:/home/postgres/ [PG1] sudo systemctl start patroni
postgres@patroni4:/home/postgres/ [PG1] systemctl status patroni
● patroni.service - dbi services patroni service
   Loaded: loaded (/etc/systemd/system/patroni.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-10-17 17:03:19 CEST; 5s ago
  Process: 8476 ExecStartPre=/usr/bin/sudo /bin/chown postgres /dev/watchdog (code=exited, status=0/SUCCESS)
  Process: 8468 ExecStartPre=/usr/bin/sudo /sbin/modprobe softdog (code=exited, status=0/SUCCESS)
 Main PID: 8482 (patroni)
   CGroup: /system.slice/patroni.service
           ├─8482 /usr/bin/python2 /u01/app/postgres/local/dmk/bin/patroni /u01/app/postgres/local/dmk/etc/patroni.yml
           ├─8500 /u01/app/postgres/product/11/db_5/bin/postgres -D /u02/pgdata/11/PG1/ --config-file=/u02/pgdata/11/PG1/postgresql.conf --listen_addresses=192.168.22.114 --max_worker_processes=8 --max_locks_per_transact...
           ├─8502 postgres: PG1: logger
           ├─8503 postgres: PG1: startup   waiting for 000000020000000000000006
           ├─8504 postgres: PG1: checkpointer
           ├─8505 postgres: PG1: background writer
           ├─8506 postgres: PG1: stats collector
           ├─8507 postgres: PG1: walreceiver
           └─8513 postgres: PG1: postgres postgres 192.168.22.114(48882) idle

To be sure everything runs correctly, check the status of the Patroni cluster.

postgres@patroni4:/home/postgres/ [PG1] patronictl list
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 |        | running |  2 |       0.0 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  2 |       0.0 |
|   PG1   | patroni3 | 192.168.22.113 | Leader | running |  2 |       0.0 |
|   PG1   | patroni4 | 192.168.22.114 |        | running |  2 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+
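
If you want to double-check on the PostgreSQL side as well, you can query pg_stat_replication on the leader (patroni3 in this setup). With three replicas attached you should see three streaming connections, one of them coming from 192.168.22.114:

postgres@patroni3:/home/postgres/ [PG1] psql -c "select client_addr, state, sync_state from pg_stat_replication;"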

Conclusion

Using the playbooks had one flaw: the hosts entry in the patroni.yml only pointed at localhost. When starting the fourth node, Patroni did not look for all the other hosts, it only checked its own availability. This works fine for an initial cluster, but not when you want to extend one.
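To illustrate, here is a hedged sketch of the broken versus the corrected entry (the exact rendered value depends on your playbook template):

# rendered by the playbook: only the local endpoint, fine for an initial cluster
  hosts: localhost:2379
# corrected entry: list all etcd members
  hosts: 192.168.22.111:2379,192.168.22.112:2379,192.168.22.113:2379,192.168.22.114:2379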
And: always keep in mind that an etcd cluster needs an odd number of members, so don't add only a fourth etcd node.
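
If you added the fourth etcd member only to follow along and want to go back to an odd-sized cluster, the member can be removed again using the ID shown in the member list above, for example:

postgres@patroni1:/home/postgres/ [PG1] etcdctl member remove dd9fab8349b3cfc

Afterwards, stop the etcd service on patroni4 and clean out its data directory; a removed member has to start from an empty data directory if it should ever rejoin.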