Some days ago EnterpriseDB released a new version of its EDB Failover Manager which brings one feature that really sounds great: “Controlled switchover and switchback for easier maintenance and disaster recovery tests”. This is exactly what you want when you are used to operate Oracle DataGuard. Switching back and forward as you like without caring much about the old master. The old master shall just be converted to a standby which follows the new master automatically. This post is about upgrading EFM from version 2.0 to 2.1.

As I still have the environment available which was used for describing the maintenance scenarios with EDB Failover Manager (Maintenance scenarios with EDB Failover Manager (1) – Standby node, Maintenance scenarios with EDB Failover Manager (2) – Primary node and Maintenance scenarios with EDB Failover Manager (3) – Witness node) I will use the same environment to upgrade to the new release. Lets start …

This is the current status of my failover cluster:

[root@edbbart ~]$ /usr/edb-efm/bin/efm cluster-status efm 
Cluster Status: efm
Automatic failover is disabled.

	Agent Type  Address              Agent  DB       Info
	--------------------------------------------------------------
	Master      192.168.22.245       UP     UP        
	Witness     192.168.22.244       UP     N/A       
	Standby     192.168.22.243       UP     UP        

Allowed node host list:
	192.168.22.244 192.168.22.245 192.168.22.243

Standby priority host list:
	192.168.22.243

Promote Status:

	DB Type     Address              XLog Loc         Info
	--------------------------------------------------------------
	Master      192.168.22.245       0/3B01C5E0       
	Standby     192.168.22.243       0/3B01C5E0       

	Standby database(s) in sync with master. It is safe to promote.
[root@edbbart ~]$ 

Obviously you have to download the new version to begin the upgrade. Once the rpm is available on all nodes simply install it on all the nodes:

[root@edbppas tmp]$ yum localinstall efm21-2.1.0-1.rhel7.x86_64.rpm

EFM 2.1 comes with an utility command that helps in upgrading a cluster. You should invoke it on each node:

[root@edbbart tmp]$ /usr/efm-2.1/bin/efm upgrade-conf efm
Processing efm.properties file.
Setting new property node.timeout to 40 (sec) based on existing timeout 5000 (ms) and max tries 8.

Processing efm.nodes file.

Upgrade of files is finished. Please ensure that the new file permissions match those of the template files before starting EFM.
The db.service.name property should be set before starting a non-witness agent.

This created a new configuration file in the new directory under /etc which was created when the new version was installed:

[root@edbbart tmp]$ ls /etc/efm-2.1
efm.nodes  efm.nodes.in  efm.properties  efm.properties.in

All the values from the old EFM cluster should be there in the new configuration files:

[root@edbbart efm-2.1]$ pwd
/etc/efm-2.1
[root@edbbart efm-2.1]$ cat efm.properties | grep daniel
user.email=daniel.westermann...

Before going further check the new configuration parameters for EFM 2.1, which are:

auto.allow.hosts
auto.resume.period
db.service.name
jvm.options
minimum.standbys
node.timeout
promotable
recovery.check.period
script.notification
script.resumed

I’ll leave everything as it was before for now. Notice that a new service got created:

[root@edbppas efm-2.1]$ systemctl list-unit-files | grep efm
efm-2.0.service                             enabled 
efm-2.1.service                             disabled

Lets try to shutdown the old service on all nodes and then start the new one. Step 1 (on all nodes):

[root@edbppas efm-2.1]$ systemctl stop efm-2.0.service
[root@edbppas efm-2.1]$ systemctl disable efm-2.0.service
rm '/etc/systemd/system/multi-user.target.wants/efm-2.0.service'

Then enable the new service:

[root@edbppas efm-2.1]$ systemctl enable efm-2.1.service
ln -s '/usr/lib/systemd/system/efm-2.1.service' '/etc/systemd/system/multi-user.target.wants/efm-2.1.service'
[root@edbppas efm-2.1]$ systemctl list-unit-files | grep efm
efm-2.0.service                             disabled
efm-2.1.service                             enabled 

Make sure your efm.nodes file contains all the nodes which make up the cluster, in my case:

[root@edbppas efm-2.1]$ cat efm.nodes
# List of node address:port combinations separated by whitespace.
# The list should include at least the membership coordinator's address.
192.168.22.243:9998 192.168.22.244:9998 192.168.22.245:9998

Lets try to start the new service on the witness node first:

[root@edbbart efm-2.1]$ systemctl start efm-2.1.service
[root@edbbart efm-2.1]$ /usr/edb-efm/bin/efm cluster-status efm 
Cluster Status: efm
VIP: 192.168.22.250
Automatic failover is disabled.

	Agent Type  Address              Agent  DB       Info
	--------------------------------------------------------------
	Witness     192.168.22.244       UP     N/A       

Allowed node host list:
	192.168.22.244

Membership coordinator: 192.168.22.244

Standby priority host list:
	(List is empty.)

Promote Status:

Did not find XLog location for any nodes.

Looks good. Are we really running the new version?

[root@edbbart efm-2.1]$ /usr/edb-efm/bin/efm -v
Failover Manager, version 2.1.0

Looks fine as well. Time to add the other nodes:

[root@edbbart efm-2.1]$ /usr/edb-efm/bin/efm add-node efm 192.168.22.243
add-node signal sent to local agent.
[root@edbbart efm-2.1]$ /usr/edb-efm/bin/efm add-node efm 192.168.22.245
add-node signal sent to local agent.
[root@edbbart efm-2.1]$ /usr/edb-efm/bin/efm cluster-status efm 
Cluster Status: efm
VIP: 192.168.22.250
Automatic failover is disabled.

	Agent Type  Address              Agent  DB       Info
	--------------------------------------------------------------
	Witness     192.168.22.244       UP     N/A       

Allowed node host list:
	192.168.22.244 192.168.22.243

Membership coordinator: 192.168.22.244

Standby priority host list:
	(List is empty.)

Promote Status:

Did not find XLog location for any nodes.

Proceed on the master:

[root@ppasstandby efm-2.1]$ systemctl start efm-2.1.service
[root@ppasstandby efm-2.1]$ systemctl status efm-2.1.service
efm-2.1.service - EnterpriseDB Failover Manager 2.1
   Loaded: loaded (/usr/lib/systemd/system/efm-2.1.service; enabled)
   Active: active (running) since Thu 2016-09-08 12:04:11 CEST; 25s ago
  Process: 4020 ExecStart=/bin/bash -c /usr/efm-2.1/bin/runefm.sh start ${CLUSTER} (code=exited, status=0/SUCCESS)
 Main PID: 4075 (java)
   CGroup: /system.slice/efm-2.1.service
           └─4075 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.77-0.b03.el7_2.x86_64/jre/bin/java -cp /usr/e...

Sep 08 12:04:07 ppasstandby systemd[1]: Starting EnterpriseDB Failover Manager 2.1...
Sep 08 12:04:08 ppasstandby sudo[4087]: efm : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/efm-... efm
Sep 08 12:04:08 ppasstandby sudo[4098]: efm : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/efm-... efm
Sep 08 12:04:08 ppasstandby sudo[4114]: efm : TTY=unknown ; PWD=/ ; USER=postgres ; COMMAND=/usr/... efm
Sep 08 12:04:08 ppasstandby sudo[4125]: efm : TTY=unknown ; PWD=/ ; USER=postgres ; COMMAND=/usr/... efm
Sep 08 12:04:10 ppasstandby sudo[4165]: efm : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/efm-...9998
Sep 08 12:04:10 ppasstandby sudo[4176]: efm : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/efm-...4075
Sep 08 12:04:11 ppasstandby systemd[1]: Started EnterpriseDB Failover Manager 2.1.
Hint: Some lines were ellipsized, use -l to show in full.

And then continue on the standby:

[root@edbppas efm-2.1]$ systemctl start efm-2.1.service
[root@edbppas efm-2.1]$ systemctl status efm-2.1.service
efm-2.1.service - EnterpriseDB Failover Manager 2.1
   Loaded: loaded (/usr/lib/systemd/system/efm-2.1.service; enabled)
   Active: active (running) since Thu 2016-09-08 12:05:28 CEST; 3s ago
  Process: 3820 ExecStart=/bin/bash -c /usr/efm-2.1/bin/runefm.sh start ${CLUSTER} (code=exited, status=0/SUCCESS)
 Main PID: 3875 (java)
   CGroup: /system.slice/efm-2.1.service
           └─3875 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.77-0.b03.el7_2.x86_64/jre/bin/jav...

Sep 08 12:05:24 edbppas systemd[1]: Starting EnterpriseDB Failover Manager 2.1...
Sep 08 12:05:25 edbppas sudo[3887]: efm : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/u...efm
Sep 08 12:05:25 edbppas sudo[3898]: efm : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/u...efm
Sep 08 12:05:25 edbppas sudo[3914]: efm : TTY=unknown ; PWD=/ ; USER=postgres ; COMMAN...efm
Sep 08 12:05:25 edbppas sudo[3925]: efm : TTY=unknown ; PWD=/ ; USER=postgres ; COMMAN...efm
Sep 08 12:05:25 edbppas sudo[3945]: efm : TTY=unknown ; PWD=/ ; USER=postgres ; COMMAN...efm
Sep 08 12:05:28 edbppas sudo[3981]: efm : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/u...998
Sep 08 12:05:28 edbppas sudo[3994]: efm : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/u...875
Sep 08 12:05:28 edbppas systemd[1]: Started EnterpriseDB Failover Manager 2.1.
Hint: Some lines were ellipsized, use -l to show in full.

What is the cluster status now?:

[root@edbbart efm-2.1]$ /usr/edb-efm/bin/efm cluster-status efm 
Cluster Status: efm
VIP: 192.168.22.250
Automatic failover is disabled.

	Agent Type  Address              Agent  DB       Info
	--------------------------------------------------------------
	Master      192.168.22.245       UP     UP        
	Witness     192.168.22.244       UP     N/A       
	Standby     192.168.22.243       UP     UP        

Allowed node host list:
	192.168.22.244 192.168.22.243 192.168.22.245

Membership coordinator: 192.168.22.244

Standby priority host list:
	192.168.22.243

Promote Status:

	DB Type     Address              XLog Loc         Info
	--------------------------------------------------------------
	Master      192.168.22.245       0/3B01C7A0       
	Standby     192.168.22.243       0/3B01C7A0       

	Standby database(s) in sync with master. It is safe to promote.

Cool. Back in operation on the new release. Quite easy.

PS: Remember to re-point your symlinks in /etc and /usr if you created symlinks for easy of use.