Policy-managed databases appeared with Oracle 11g R2 but are still not very common, and I had never had the opportunity to administer this kind of database. Since Oracle now presents policy-managed databases as the default installation mode and as a best practice, I wanted to know more about this feature, which fully automates the placement of databases within an Oracle cluster. In this post, I will describe how to implement and handle policy-managed databases, and in a future post I will detail the new Oracle 12c improvements.

Understanding the server-pool concept

The first and maybe the major concept to understand is the server pool. A server pool is a logical group of servers within the cluster, sharing resources for databases and/or services. The following diagram shows two server pools, with two nodes each.

[Diagram rac_pmdb: a cluster with two server pools of two nodes each]

Depending on the configuration, a node can be a candidate for several server pools or be restricted to a single named server pool. However, a node is assigned to only one server pool at a time.
With server pools, a database instance is no longer assigned to a named server but to a server pool, and the cluster manages instance placement across the nodes of that pool.

Scalability and availability of applications (databases, services…) within a server pool can be dynamically defined using the MIN_SIZE, MAX_SIZE and IMPORTANCE parameters. These parameters are defined at the server pool level:

  • MIN_SIZE represents the minimum number of nodes the server pool should contain. If the number of nodes falls below the MIN_SIZE value, Oracle moves servers from other server pools – if available – to satisfy this parameter.
  • MAX_SIZE represents the maximum number of nodes the server pool can contain. MAX_SIZE can be seen as a target rather than a guarantee: Oracle will try to satisfy it only once the MIN_SIZE of every server pool has been reached.
  • IMPORTANCE controls the availability of an application. Its value ranges from 0 (not important) to 1000. The server pool with the highest importance value has resources allocated to it in priority: new available nodes are assigned to the most important server pool first, and nodes can be de-assigned from a less important server pool to satisfy the MIN_SIZE of a more important one.

As you may expect, the IMPORTANCE parameter is used to guarantee the availability of business-critical services, to the detriment of less important applications such as development.
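
For reference, once the server pools exist, their attributes can also be displayed at the clusterware level with crsctl. The sketch below is only an illustration: the pool name and values are examples, and the real output contains additional attributes.

[grid@node01 ~]$ crsctl status serverpool -p
NAME=ora.srvprod
IMPORTANCE=0
MIN_SIZE=2
MAX_SIZE=3
SERVER_NAMES=
...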

Creating a server-pool

In this example, I will use a cluster composed of four identical nodes. Two are up and running; the cluster services are not running on the two remaining nodes at this time.

First, I will create two server pools: one for production databases, and one for development/test databases. As a first example, I set the MIN and MAX values to 1, and the IMPORTANCE value to 0, the minimum. I use the srvctl tool as the oracle user.

It is possible to create a server pool with or without specifying a list of nodes. If you specify one or several nodes as part of a server pool, only these listed nodes will be able to migrate to this server pool. This list is called “Candidate server names”, and it implicitly excludes any node not listed. If you do not specify any node name, every node can migrate to the server pool: all nodes are “candidates” for this pool. You will see later that the “Candidate server names” field shows the nodes which have been explicitly assigned to a server pool.
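
As an illustration, a pool restricted to two candidate servers could be created as follows. The pool name srvapp is purely hypothetical, and -servers is the Oracle 12c verbose option for the candidate server list; only node01 and node02 would then ever be assigned to this pool:

[oracle@node01 ~]$ srvctl add srvpool -serverpool srvapp -min 1 -max 2 -importance 0 -servers "node01,node02"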

Here, I will not specify candidate servers, so all nodes within my cluster can migrate to any server pool.

[oracle@node01 ~]$ srvctl add srvpool -g srvtest -l 1 -u 1 -i 0
[oracle@node01 ~]$ srvctl add srvpool -g srvprod -l 1 -u 1 -i 0


This is the list of options used:

  • -g: name of the server pool
  • -l: minimum number of nodes
  • -u: maximum number of nodes
  • -i: importance value

We can get the status of the different server pools with the following command:

[oracle@node01 ~]$ srvctl status srvpool -detail
Server pool name: Free
Active servers count: 0
Active server names:
Server pool name: Generic
Active servers count: 0
Active server names:
Server pool name: srvprod
Active servers count: 1
Active server names: node02
NAME=node02 STATE=ONLINE
Server pool name: srvtest
Active servers count: 1
Active server names: node01
NAME=node01 STATE=ONLINE


As we can see, there are four server pools: Free, Generic, srvtest and srvprod.

  • The Free server pool contains unassigned servers. It is empty here since the two available servers are already used by the srvprod and srvtest pools.
  • The Generic server pool contains servers running admin-managed databases. It is empty because there is no admin-managed database running in this cluster.
  • The srvprod and srvtest pools have been created above, and contain one node each due to their min/max values set to 1.
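
We can also check which servers the clusterware currently sees and their state. This is just a quick check with crsctl, run here as the grid user; at this point only the two running nodes are reported:

[grid@node01 ~]$ crsctl status server
NAME=node01
STATE=ONLINE

NAME=node02
STATE=ONLINE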

Assigning a database to a server-pool

We can assign a new database to a server-pool with dbca. In the following example, I have created the DB1 database in the existing srvprod server-pool. Note that policy managed is the default choice proposed starting with Oracle 12c.

[Screenshots 01 and 02: dbca creation of the DB1 policy-managed database in the srvprod server pool]

Once the database is created, we can see that it has started on available nodes of the srvprod server-pool:

[oracle@node01 ~]$ srvctl status database -d DB1
Instance DB1_1 is running on node node01

As with an admin-managed database, the instance name gets a numeric suffix, but for a policy-managed database an underscore is also used between the database name and the number. Here DB1 gets the instance name DB1_1.
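
We can also confirm the instance name from inside the database. This is a quick check from the instance running on node01, the output being shown for illustration only:

SQL> select instance_name, instance_number from v$instance;

INSTANCE_NAME    INSTANCE_NUMBER
---------------- ---------------
DB1_1                          1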

I have created a second database DB2 in the srvtest server-pool.

[oracle@node01 ~]$ srvctl status database -d DB2
Instance DB2_1 is running on node node02
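
We can also verify that each database is attached to a server pool rather than to a fixed list of nodes. The output below is abridged, and the exact fields vary with the version:

[oracle@node01 ~]$ srvctl config database -d DB1
Database unique name: DB1
...
Server pools: srvprod
...
Type: RAC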


Impact of server-pool parameters

Let’s study the impact of MIN_SIZE, MAX_SIZE and IMPORTANCE parameters. In this example, I will first change the value of these parameters for the srvprod server pool, and then I will start cluster services on the two remaining nodes.

The srvprod pool will have a MIN_SIZE of 2 and a MAX_SIZE of 3:

[oracle@node01 ~]$ srvctl modify srvpool -min 2 -max 3 -serverpool srvprod

We now have the following configuration:

[oracle@node01 ~]$ srvctl config srvpool
...
Server pool name: srvprod
Importance: 0, Min: 2, Max: 3
Category: hub
Candidate server names:
Server pool name: srvtest
Importance: 0, Min: 1, Max: 1
Category: hub
Candidate server names:


(“Candidate server names” is empty since I have not defined any server as part of a server pool.)

Now I start cluster services on node 03 and node 04.

[root@node03 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
 
[root@node04 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

After a few minutes, I can see that srvprod now has 3 nodes, satisfying the MAX_SIZE parameter. srvtest still has 1 node, which conforms to its defined MIN_SIZE and MAX_SIZE:

[oracle@node01 ~]$ srvctl status srvpool
...
Server pool name: srvprod
Active servers count: 3
Active server names: node01,node03,node04
NAME=node01 STATE=ONLINE
NAME=node03 STATE=ONLINE
NAME=node04 STATE=ONLINE
Server pool name: srvtest
Active servers count: 1
Active server names: node02
NAME=node02 STATE=ONLINE

The test database DB2 is running on node 02:

[oracle@node01 ~]$ srvctl status database -d DB2
Instance DB2_1 is running on node node02


Example 1: MIN_SIZE and MAX_SIZE

Let’s see what happens if I stop cluster services on node 02, in the srvtest pool, while the DB2 instance is running.

[root@node02 bin]# date
Mon Apr 28 11:49:09 CEST 2014
[root@node02 bin]# ./crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node02'
CRS-2673: Attempting to stop 'ora.crsd' on 'node02'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'node02'
CRS-2673: Attempting to stop 'ora.db2.db' on 'node02'
...


To see what is going on, I consult the crsd.log file on the master node of the cluster.

Tip

To identify the master node within a cluster, grep for the “OCR MASTER” string in the crsd.log file of any node. You will find “I AM THE NEW OCR MASTER” on the master node, or “NEW OCR MASTER IS N” on the other nodes, where N is the node number returned by the ‘olsnodes -n’ command:

[grid@node01 ~]$ cat /u00/app/12.1.0/grid/log/node01/crsd/crsd.log | grep -i 'ocr master'
...
...
2014-04-28 04:53:16.388: [  OCRMAS][2831439616]th_master:12: I AM THE NEW OCR MASTER at incar 1. Node Number 1
 
[grid@node01 ~]$ olsnodes -n
node01  1
node02  2
node03  3
node04  4


The crsd.log file in /u00/app/12.1.0/grid/log/node01/crsd/crsd.log shows the following events:

2014-04-28 11:50:50.878: [   CRSPE][2296276736] {1:4797:769} Pool ora.srvtest [min:1][max:1][importance:0] NO SERVERS ASSIGNED cannot be a donor  for pool ora.srvtest [min:1][max:1][importance:0] NO SERVERS ASSIGNED
2014-04-28 11:50:50.878: [   CRSPE][2296276736] {1:4797:769}Server [node04] has been un-assigned from the server pool: ora.srvprod
2014-04-28 11:50:50.880: [   CRSPE][2296276736] {1:4797:769}Server [node04] has been assigned to the server pool: ora.srvtest
...
2014-04-28 11:50:50.881: [   CRSPE][2296276736] {1:4797:769} Server [node04] has changed state from [ONLINE] to [RECONFIGURING]
...
2014-04-28 11:50:50.882: [   CRSPE][2296276736] {1:4797:769} CRS-2673: Attempting to stop 'ora.db1.db' on 'node04'
...
2014-04-28 11:51:15.708: [   CRSPE][2296276736] {1:4797:769} CRS-2677: Stop of 'ora.db1.db' on 'node04' succeeded
2014-04-28 11:51:15.708: [   CRSPE][2296276736] {1:4797:769} Updated state details for server node04 from [STOPPING RESOURCES] to : []
2014-04-28 11:51:15.708: [   CRSPE][2296276736] {1:4797:769} Server [node04] has changed state from [RECONFIGURING] to [ONLINE]
...
2014-04-28 11:51:15.718: [   CRSPE][2296276736] {1:4797:769} CRS-2672: Attempting to start 'ora.db2.db' on 'node04'
...
2014-04-28 11:51:54.749: [   CRSPE][2296276736] {1:4797:769} RI [ora.db2.db 1 1] new internal state: [STABLE] old value: [STARTING]
2014-04-28 11:51:54.749: [   CRSPE][2296276736] {1:4797:769} RI [ora.db2.db 1 1] new external state [ONLINE] old value: [OFFLINE] on node04 label = [Open]
2014-04-28 11:51:54.749: [   CRSPE][2296276736] {1:4797:769} Set LAST_SERVER to node04 for [ora.db2.db 1 1]
2014-04-28 11:51:54.750: [   CRSPE][2296276736] {1:4797:769} Set State Details to [Open] from [ Instance Shutdown] for [ora.db2.db 1 1]
2014-04-28 11:51:54.750: [   CRSPE][2296276736] {1:4797:769} CRS-2676: Start of 'ora.db2.db' on 'node04' succeeded


Node 04 has been de-assigned from the srvprod server pool and assigned to the srvtest server pool. We can see that the DB1 instance running on node04 has been shut down, and that a DB2 instance has then been started. We can confirm this with srvctl commands:

[oracle@node01 crsd]$ srvctl status database -d DB2
Instance DB2_1 is running on node node04
 
[oracle@node01 ~]$ srvctl status srvpool
...
Server pool name: srvprod
Active servers count: 2
Active server names: node01,node03
NAME=node01 STATE=ONLINE
NAME=node03 STATE=ONLINE
Server pool name: srvtest
Active servers count: 1
Active server names: node04
NAME=node04 STATE=ONLINE

What can we infer from this? Oracle has taken one node from the srvprod server pool because it contained three servers (its MAX_SIZE), while its mandatory number of servers (MIN_SIZE) is 2. The MIN_SIZE value remains satisfied for both the srvprod and srvtest pools, and both databases DB1 and DB2 are accessible.
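
We could also verify that DB1 is still running on the two remaining srvprod nodes. The output below is only indicative: with policy-managed databases the instance-to-node mapping is not fixed, so the instance numbers may differ:

[oracle@node01 ~]$ srvctl status database -d DB1
Instance DB1_1 is running on node node01
Instance DB1_2 is running on node node03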

Example 2: IMPORTANCE parameter

To demonstrate how this parameter works, I will edit properties for the srvprod server pool and set an IMPORTANCE of 1. I will then stop cluster services on node 03, which is part of srvprod server pool.

First, I set an IMPORTANCE value of 1 for the srvprod server pool:

[oracle@node01 ~]$ srvctl modify srvpool -importance 1 -serverpool srvprod
[oracle@node01 ~]$ srvctl config srvpool -serverpool srvprod
Server pool name: srvprod
Importance: 1, Min: 2, Max: 3
Category: hub
Candidate server names:

Now, I stop cluster services on node 03. The number of nodes remaining in the srvprod server pool will fall below the MIN_SIZE value, with only 1 server left.

[root@node03 bin]# ./crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node03'
CRS-2673: Attempting to stop 'ora.crsd' on 'node03'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'node03'
...
CRS-4133: Oracle High Availability Services has been stopped.

Here is what we can see in the crsd.log file:

2014-04-28 06:27:29.222: [   CRSPE][2296276736] {1:4797:910}Server [node04] has been un-assigned from the server pool: ora.srvtest
2014-04-28 06:27:29.223: [   CRSPE][2296276736] {1:4797:910}Server [node04] has been assigned to the server pool: ora.srvprod
...
2014-04-28 06:27:29.224: [   CRSPE][2296276736] {1:4797:910} Created alert : (:CRSPE00165:) :  Server Pool ora.srvtest has fallen below minimum size
...
2014-04-28 06:27:29.224: [   CRSPE][2296276736] {1:4797:910} Server [node04] has changed state from [ONLINE] to [RECONFIGURING]
2014-04-28 06:27:29.224: [   CRSPE][2296276736] {1:4797:910} Evaluating stop of [ora.db2.db] on [node04] with parameters: Options:0x1;Identity: root;Reason:system;
...
2014-04-28 06:27:29.226: [   CRSPE][2296276736] {1:4797:910} CRS-2673: Attempting to stop 'ora.db2.db' on 'node04'
...
2014-04-28 06:28:00.352: [   CRSPE][2296276736] {1:4797:910} CRS-2677: Stop of 'ora.db2.db' on 'node04' succeeded
2014-04-28 06:28:00.352: [   CRSPE][2296276736] {1:4797:910} Updated state details for server node04 from [STOPPING RESOURCES] to : []
2014-04-28 06:28:00.352: [   CRSPE][2296276736] {1:4797:910} Server [node04] has changed state from [RECONFIGURING] to [ONLINE]
...
2014-04-28 06:28:00.360: [   CRSPE][2296276736] {1:4797:910} CRS-2672: Attempting to start 'ora.db1.db' on 'node04'
...
2014-04-28 06:28:25.563: [   CRSPE][2296276736] {1:4797:910} CRS-2676: Start of 'ora.db1.db' on 'node04' succeeded

Oracle has moved node 04 from the srvtest server pool to the srvprod server pool:

[oracle@node01 ~]$ srvctl status srvpool
...
Server pool name: srvprod
Active servers count: 2
Active server names: node01,node04
NAME=node01 STATE=ONLINE
NAME=node04 STATE=ONLINE
Server pool name: srvtest
Active servers count: 0
Active server names:
 
[oracle@node01 ~]$ srvctl status database -d DB1
Instance DB1_1 is running on node node01
Instance DB1_2 is running on node node04
 
[oracle@node01 ~]$ srvctl status database -d DB2
Database is not running.


As we can see, there is no server available in the srvtest server pool: no DB2 instance is running and the application is no longer accessible. However, the MIN_SIZE is still satisfied for the srvprod server pool: two nodes are running and two database instances are accessible.

Because the srvprod IMPORTANCE value is greater than the srvtest one, Oracle has preferred to keep the production service available, even if it means that the test pool is left without any server…
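
To restore the test service, one option would be to bring one of the stopped nodes back with crsctl start crs: since srvtest is below its MIN_SIZE, the returning server should be assigned to it. Another option, sketched below and not run in this test, would be to relax the srvprod MIN_SIZE so that the cluster can donate a server back to srvtest:

[oracle@node01 ~]$ srvctl modify srvpool -serverpool srvprod -min 1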

Conclusion

With these two examples, I have shown how the MIN_SIZE, MAX_SIZE and IMPORTANCE parameters can impact application availability within a policy-managed cluster. They also show how important it is to define these parameters according to your business rules, because they really help to keep resources available for critical applications.
In a future post, I will show how these parameters can be used to dynamically adapt the available resources to the workload of an application.