In the last post we had a look at how EDB BART integrates into EDB PEM, so that backups and restores can be managed centrally from the PEM console. Another tool that comes with an EDB subscription is EDB EFM (EDB Failover Manager). Again, we will not cover how that tool works or how you need to set it up, but rather focus on how EFM integrates into PEM. What I expect from the integration of EFM into PEM is the following:

  • An overview of the cluster status, its members and roles
  • Initiating a controlled switchover
  • Integrated health-checks and notifications

So, let’s see what PEM brings on top of the command line when it comes to EFM.

As said above, we’ll not look at how EFM needs to be configured. This has already been done, and the current status of the fail-over cluster is this:

[root@edb-as12-1 efm-3.10]$ /usr/edb/efm-3.10/bin/efm cluster-status efm
Cluster Status: efm

        Agent Type  Address              Agent  DB       VIP
        -----------------------------------------------------------------------
        Standby     10.0.1.114           UP     UP       10.0.1.233
        Witness     10.0.1.197           UP     N/A      10.0.1.233
        Master      10.0.1.82            UP     UP       10.0.1.233*

Allowed node host list:
        10.0.1.197 edb-bart edb-as12-1 edb-as12-2

Membership coordinator: 10.0.1.197

Standby priority host list:
        10.0.1.114

Promote Status:

        DB Type     Address              WAL Received LSN   WAL Replayed LSN   Info
        ---------------------------------------------------------------------------
        Master      10.0.1.82                               0/29000060         
        Standby     10.0.1.114           0/29000060         0/29000060         

        Standby database(s) in sync with master. It is safe to promote.

There is one primary database and one replica which is open for read-only connections. In addition, there is a witness EFM agent running on the BART host (this is the same host we used in the last post for storing the backups). The witness node is required if you want automatic fail-over, as you need at least three agents to build the quorum. EFM is also configured to bring up a virtual IP address, so the applications have a single point to connect to.
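Why three agents are needed can be made concrete with a little arithmetic. The following is a minimal sketch that assumes a simple majority rule, which is how such a quorum usually works. It is not EFM's internal implementation, and the function names are made up for illustration.

```python
def majority(agents: int) -> int:
    """Smallest number of agents that forms a strict majority."""
    return agents // 2 + 1


def tolerated_failures(agents: int) -> int:
    """Number of agents that may be lost while a majority can still be formed."""
    return agents - majority(agents)


# Compare a two-agent setup (primary + replica) with the
# three-agent setup used here (primary + replica + witness).
for n in (2, 3):
    print(f"{n} agents: majority={majority(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

With two agents, losing either one already breaks the majority, so the surviving agent cannot tell a dead primary from a network split. With the witness added, one agent can be lost and the remaining two still agree.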

PEM comes with a dashboard for streaming replication, so that should show us the current status of our fail-over cluster:

There is not much we can see here. By default PEM does not know anything about EFM, even if it is already configured and running. This might be something that can be improved in future versions of PEM, as I do not see a reason why the PEM agent should not be able to auto-discover a configured EFM cluster. To make PEM aware of the EFM configuration we need to provide some basic information in the “Properties” dialog of each instance in the cluster (the servers need to be disconnected before the properties can be adjusted):

Once we have the EFM details configured for each database server, the streaming replication dashboard will display the details of the EFM cluster:

What you can see here is basically the same information you would see on the command line when you ask for the cluster status. What’s a bit strange is the mix of IP addresses and host names, but maybe that’s just cosmetic. It also looks strange at first that the “Status information” and “Xlog Information” columns are empty, but they are only populated if there is an error. As long as all is fine they stay empty. The same is true for “Cluster status message”: it will only contain information if something is wrong.

What I would like to see on this screen in addition:

  • The content of the efm.properties for each node
  • A message indicating if it is currently safe to promote a replica (as you can see in the command line output)
  • The possibility to initiate a controlled switch-over
  • Lag information in a time unit
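Regarding the lag: the LSNs from the command line output above can at least be turned into a lag in bytes. The following is a minimal sketch in Python. It relies on the PostgreSQL LSN text format (two hexadecimal numbers separated by a slash, the high and the low 32 bits of the WAL position); the function names are made up for illustration.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '0/29000060' into an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(primary_lsn: str, standby_replayed_lsn: str) -> int:
    """Bytes of WAL the standby still has to replay."""
    return lsn_to_int(primary_lsn) - lsn_to_int(standby_replayed_lsn)


# Values taken from the "Promote Status" section of the cluster status above.
print(lag_bytes("0/29000060", "0/29000060"))  # prints 0
```

A lag in bytes is not a lag in time. For a time-based value one option is to compare now() with pg_last_xact_replay_timestamp() on the replica, keeping in mind that this number also grows when the primary is simply idle.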

According to the documentation there should be the possibility to initiate a fail-over from the management menu, but this option does not exist (or I am not able to find it):

From an alerting perspective there are no predefined alerts for EFM:

If you want to have that you need to create your own alerts and probes.
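How custom probes and alerts are defined in PEM is not covered here. For the data collection side, one possible starting point is a small script that parses the text output of "efm cluster-status" shown earlier. The following is a minimal sketch in Python, assuming the output format stays exactly as shown in this post. The function names and the returned structure are made up for illustration and are not part of EFM or PEM.

```python
# Shortened copy of the "efm cluster-status efm" output shown earlier in this post.
SAMPLE = """\
Cluster Status: efm

        Agent Type  Address              Agent  DB       VIP
        -----------------------------------------------------------------------
        Standby     10.0.1.114           UP     UP       10.0.1.233
        Witness     10.0.1.197           UP     N/A      10.0.1.233
        Master      10.0.1.82            UP     UP       10.0.1.233*

        Standby database(s) in sync with master. It is safe to promote.
"""

COLUMNS = ("type", "address", "agent", "db", "vip")


def parse_cluster_status(text: str) -> dict:
    """Extract the agent table and the promote hint from the status text."""
    agents = []
    in_agent_table = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Agent Type"):
            in_agent_table = True      # header row of the agent table
            continue
        if not in_agent_table:
            continue
        if not stripped:
            in_agent_table = False     # a blank line ends the table
            continue
        if set(stripped) == {"-"}:
            continue                   # separator line made of dashes
        fields = stripped.split()
        if len(fields) == len(COLUMNS):
            agents.append(dict(zip(COLUMNS, fields)))
    return {
        "agents": agents,
        "safe_to_promote": "It is safe to promote" in text,
    }


def unhealthy_agents(status: dict) -> list:
    """Agents that are not UP, or whose database is neither UP nor N/A (witness)."""
    return [a for a in status["agents"]
            if a["agent"] != "UP" or a["db"] not in ("UP", "N/A")]


status = parse_cluster_status(SAMPLE)
print(len(status["agents"]), status["safe_to_promote"], unhealthy_agents(status))
```

In a real check the text would come from running the efm command instead of the embedded sample, and the result would be handed over to whatever alerting mechanism is in use.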

Conclusion: The integration of EFM into PEM is not as advanced as the BART integration. You get basic statistics once you have configured the EFM settings, but not more. It would be great if the points raised in this blog post were addressed in a future release.