Michael Hein

Patching a virtualized ODA to patch 12.2.1.4.0

This article describes patching a virtualized Oracle Database Appliance (ODA) containing only an ODA_BASE virtual machine.

Run this patching on test machines first, because this article cannot guarantee that it covers every cause of failure on single-VM ODAs. In my experience, the precheck for ODA patches does not detect some failure conditions, which may lead to an unusable ODA.

Overview:
Patch first to 12.1.2.12.0
After that patch to 12.2.1.4.0
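
The two-step order above can be expressed as a small version comparison. This is only a sketch: the function names `version_lt` and `patch_path` are made up for illustration, the installed version has to be passed in by hand, and GNU `sort -V` is assumed to be available.

```shell
# Intermediate release required by the overview above.
INTERMEDIATE="12.1.2.12.0"
TARGET="12.2.1.4.0"

# version_lt A B: succeeds if version A sorts strictly before version B.
# Relies on GNU sort -V (version sort).
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# patch_path INSTALLED: prints the patch versions to apply, in order.
patch_path() {
    if version_lt "$1" "$INTERMEDIATE"; then
        echo "$INTERMEDIATE"
    fi
    if version_lt "$1" "$TARGET"; then
        echo "$TARGET"
    fi
}
```

For example, `patch_path 12.1.2.10.0` lists both releases, while an ODA that is already on 12.1.2.12.0 only gets the second one.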

Procedure for both patches:

Preparation:

Unpack all files of the patch into the repository on all nodes as user root:

oakcli unpack -package /directory_name/file_name
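
Because a patch usually consists of several files, the unpack step can be wrapped in a loop. The following is a sketch under assumptions: the function name `unpack_patch_files` and the `OAKCLI` override variable are hypothetical, and it assumes that all patch files are `.zip` files in one directory.

```shell
# unpack_patch_files DIR: runs "oakcli unpack -package" for every zip in DIR.
# OAKCLI can be overridden (for example OAKCLI=echo) to preview the commands
# without running them.
unpack_patch_files() {
    dir=$1
    for f in "$dir"/*.zip; do
        # An unmatched glob stays literal, so check that the file exists.
        [ -e "$f" ] || { echo "no zip files in $dir" >&2; return 1; }
        "${OAKCLI:-oakcli}" unpack -package "$f" || return 1
    done
}
```

Running it once with `OAKCLI=echo` shows the commands that would be executed; it has to be repeated on every node.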

Verify the patch and the components to be patched on all servers:

[root@xx1 ~]# oakcli update -patch 12.2.1.4.0 --verify
INFO: 2018-09-24 08:32:52: Reading the metadata file now...
Component Name            Installed Version         Proposed Patch Version
---------------           ------------------        -----------------
Controller_INT            4.650.00-7176             Up-to-date
Controller_EXT            13.00.00.00               Up-to-date
Expander                  0291                      0306
SSD_SHARED {
[ c1d20,c1d21,c1d22,      A29A                      Up-to-date
c1d23 ]
[ c1d0,c1d1,c1d2,c1d      A29A                      Up-to-date
3,c1d4,c1d5,c1d6,c1d
7,c1d8,c1d9,c1d10,c1
d11,c1d12,c1d13,c1d1
4,c1d15,c1d16,c1d17,
c1d18,c1d19 ]
}
SSD_LOCAL                 0R3Q                      Up-to-date
ILOM                      3.2.9.23 r116695          4.0.2.26.a r123797
BIOS                      38070200                  38100300
IPMI                      1.8.12.4                  Up-to-date
HMP                       2.3.5.2.8                 2.4.1.0.11
OAK                       12.1.2.12.0               12.2.1.4.0
OL                        6.8                       6.9
OVM                       3.4.3                     3.4.4
GI_HOME                   12.1.0.2.170814(2660      12.2.0.1.180417(2767
                          9783,26609945)            4384,27464465)
DB_HOME {
[ OraDb12102_home1 ]      12.1.0.2.170814(2660      12.1.0.2.180417(2733
                          9783,26609945)            8029,27338020)
[ OraDb11204_home2 ]      11.2.0.4.170418(2473      11.2.0.4.180417(2733
                          2075,23054319)            8049,27441052)
}
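
To get a quick list of what the patch will actually change, the verify output can be filtered for rows whose last column is not "Up-to-date". This is a rough sketch with a hypothetical function name; it only looks at single-line component rows and deliberately skips the wrapped disk lists and the DB_HOME entries, which still need a manual look.

```shell
# components_to_patch: reads "oakcli update -patch <version> --verify" output
# on stdin and prints the name of each component whose last column is not
# "Up-to-date". Header, INFO lines and wrapped rows are ignored.
components_to_patch() {
    awk 'NF >= 3 && $1 ~ /^[A-Za-z_]+$/ && $1 != "Component" && $NF != "Up-to-date" { print $1 }'
}
```

A possible use is `oakcli update -patch 12.2.1.4.0 --verify | components_to_patch`, run on each server.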

Validate the whole ODA (not during peak load):

oakcli validate -a

Show versions of all installed components (example is after patching):

[root@xx1 ~]# oakcli show version -detail
Reading the metadata. It takes a while...
System Version  Component Name            Installed Version         Supported Version
--------------  ---------------           ------------------        -----------------
12.2.1.4.0
                Controller_INT            4.650.00-7176             Up-to-date
                Controller_EXT            13.00.00.00               Up-to-date
                Expander                  0306                      Up-to-date
                SSD_SHARED {
                [ c1d20,c1d21,c1d22,      A29A                      Up-to-date
                c1d23 ]
                [ c1d0,c1d1,c1d2,c1d      A29A                      Up-to-date
                3,c1d4,c1d5,c1d6,c1d
                7,c1d8,c1d9,c1d10,c1
                d11,c1d12,c1d13,c1d1
                4,c1d15,c1d16,c1d17,
                c1d18,c1d19 ]
                }
                SSD_LOCAL                 0R3Q                      Up-to-date
                ILOM                      4.0.2.26.a r123797        Up-to-date
                BIOS                      38100300                  Up-to-date
                IPMI                      1.8.12.4                  Up-to-date
                HMP                       2.4.1.0.11                Up-to-date
                OAK                       12.2.1.4.0                Up-to-date
                OL                        6.9                       Up-to-date
                OVM                       3.4.4                     Up-to-date
                GI_HOME                   12.2.0.1.180417(2767      Up-to-date
                                          4384,27464465)
                DB_HOME                   11.2.0.4.170418(2473      11.2.0.4.180417(2733
                                          2075,23054319)            8049,27441052)

Dry run of ospatch (a dry run is not available for any component other than ospatch):

[root@xx1 ~]# oakcli validate -c ospatch -ver 12.2.1.4.0
INFO: Validating the OS patch for the version 12.2.1.4.0
INFO: 2018-09-25 08:34:28: Performing a dry run for OS patching
INFO: 2018-09-25 08:34:52: There are no conflicts. OS upgrade could be successful

All packages which are reported as incompatible must be removed before patching. Prepare compatible versions of these packages beforehand, and make sure that somebody who can install and configure them properly after patching is available.
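
If the dry-run output is saved to a file, the result can be checked mechanically before going on. This is a minimal sketch: the function name is hypothetical, and the only thing it relies on is the "There are no conflicts" message shown in the output above. When the message is missing, the log still has to be read by hand to find the reported packages.

```shell
# ospatch_dry_run_ok LOGFILE: succeeds only if the saved dry-run output
# contains the "no conflicts" message printed by the OS patch dry run.
ospatch_dry_run_ok() {
    grep -q "There are no conflicts" "$1"
}
```

The log can be captured with something like `oakcli validate -c ospatch -ver 12.2.1.4.0 | tee /tmp/ospatch_dryrun.log`, where the log path is only an example.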

Before applying patch:
In Data Guard installations, set the state to APPLY-OFF for all standby databases
Disable all jobs which use Grid Infrastructure or databases
Set all ACFS replications to “pause”.
Unmount all ACFS filesystems
Stop all agents on all ODA nodes
Remove all resources from Grid Infrastructure which depend on ACFS filesystems (srvctl remove)
These resources can be determined with:

crsctl stat res -dependency | grep -i acfs

Remove all packages which were found to be incompatible with the patch.
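
For the Data Guard step in the checklist above, the broker statements can be generated for all standby databases at once. This is a sketch assuming that the Data Guard broker (DGMGRL) is in use; the function name and the database names `stby1` and `stby2` are hypothetical.

```shell
# dg_apply_statements STATE DB...: prints one DGMGRL statement per standby
# database. STATE is APPLY-OFF before patching and APPLY-ON afterwards.
dg_apply_statements() {
    state=$1
    shift
    case $state in
        APPLY-ON|APPLY-OFF) ;;
        *) echo "unsupported state: $state" >&2; return 1 ;;
    esac
    for db in "$@"; do
        printf "EDIT DATABASE '%s' SET STATE='%s';\n" "$db" "$state"
    done
}
```

`dg_apply_statements APPLY-OFF stby1 stby2` prints the statements for review; they can then be executed in a DGMGRL session. The same function with APPLY-ON covers the step after patching.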

Note:
The scripts of both patches cannot unmount ACFS filesystems (at least not filesystems mounted via the registry), and the use of Grid Infrastructure files by mounted ACFS filesystems causes both patches to fail. The check scripts of both patches do not seem to test for this condition. In Grid Infrastructure, every resource on which other resources depend must exist; otherwise the configuration of the dependent resources must be saved and these resources must be removed from GI.
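
Saving the configuration before removing a resource can be done by dumping its profile with `crsctl stat res <name> -p`. The following is a sketch, not a tested procedure: the function name, the `CRSCTL` override variable and the output directory are hypothetical, and the resource names have to be collected by hand from the dependency listing.

```shell
# save_resource_profiles OUTDIR: reads resource names on stdin (one per line)
# and stores the output of "crsctl stat res NAME -p" in OUTDIR/NAME.profile.
# CRSCTL can be overridden, for example with echo for a dry run.
save_resource_profiles() {
    outdir=$1
    mkdir -p "$outdir" || return 1
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        # </dev/null keeps the command from consuming the list of names.
        "${CRSCTL:-crsctl}" stat res "$name" -p \
            > "$outdir/$name.profile" < /dev/null || return 1
    done
}
```

With a file containing one resource name per line, `save_resource_profiles /root/gi_backup < resources.txt` keeps a copy of each profile that can be consulted when the resources are recreated after patching.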

Use the UNIX tool screen when applying the patch, because any network interruption causes the patch to fail.
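
A small guard can make sure the patch is never started outside of screen. This sketch relies on the fact that GNU screen sets the environment variable STY inside a session; the function name is hypothetical.

```shell
# in_screen: succeeds when the shell runs inside a GNU screen session,
# which screen indicates by setting the STY environment variable.
in_screen() {
    [ -n "${STY:-}" ]
}
```

It could be used as `in_screen || { echo "start screen first" >&2; exit 1; }` at the top of a wrapper script. After a lost connection, `screen -r` reattaches to the running session.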

Patching:
Only server and storage should be patched with the oakcli script; databases should be patched manually. At least 10 GB of free disk space must be available in the / filesystem and at least 15 GB in /u01.
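
The free-space requirement can be checked up front with `df`. This is a minimal sketch with hypothetical function names; the thresholds are the ones stated above, and free space is rounded down to whole GB.

```shell
# free_gb PATH: prints the free space of the filesystem holding PATH in
# whole GB. df -P guarantees one output line per filesystem.
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# has_free_gb PATH MIN_GB: succeeds if at least MIN_GB are available.
has_free_gb() {
    [ "$(free_gb "$1")" -ge "$2" ]
}

# check_patch_space: applies the thresholds for / and /u01.
check_patch_space() {
    has_free_gb / 10 || { echo "less than 10 GB free in /" >&2; return 1; }
    has_free_gb /u01 15 || { echo "less than 15 GB free in /u01" >&2; return 1; }
}
```

`check_patch_space` should be run on every node before starting the server patch.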

All commands have to be executed on the primary ODA node as user root. The HTTP server error at the end of server patching can be ignored.


[root@xx1 ~]# screen
[root@xx1 ~]# oakcli update -patch 12.2.1.4.0 --server
INFO: DB, ASM, Clusterware may be stopped during the patch if required
INFO: Both Nodes may get rebooted automatically during the patch if required
Do you want to continue: [Y/N]?: Y
INFO: User has confirmed for the reboot
INFO: Patch bundle must be unpacked on the second Node also before applying the patch
Did you unpack the patch bundle on the second Node? : [Y/N]? : Y
INFO: All the VMs except the ODABASE will be shutdown forcefully if needed
Do you want to continue : [Y/N]? : Y
INFO: Running pre-install scripts
INFO: Running prepatching on node 0
INFO: Running prepatching on node 1
INFO: Completed pre-install scripts
INFO: Patching server component (rolling)
INFO: Stopping VMs, repos and OAKD on both nodes...
INFO: Stopped Oakd
...
INFO: Patching the server on node: xx2
INFO: it may take upto 60 minutes. Please wait
INFO: Infrastructure patching summary on node: xx1
INFO: Infrastructure patching summary on node: xx2
SUCCESS: 2018-09-25 09:42:24: Successfully upgraded the HMP
SUCCESS: 2018-09-25 09:42:24: Successfully updated the OAK
SUCCESS: 2018-09-25 09:42:24: Successfully updated the JDK
INFO: 2018-09-25 09:42:24: IPMI is already upgraded
SUCCESS: 2018-09-25 09:42:24: Successfully upgraded the OS
SUCCESS: 2018-09-25 09:42:24: Successfully updated the device OVM
SUCCESS: 2018-09-25 09:42:24: Successfully upgraded the HMP on Dom0
INFO: 2018-09-25 09:42:24: Local storage patching summary on Dom0...
SUCCESS: 2018-09-25 09:42:24: Successfully upgraded the local storage
SUCCESS: 2018-09-25 09:42:24: Successfully updated the device Ilom
SUCCESS: 2018-09-25 09:42:24: Successfully updated the device BIOS
INFO: 2018-09-25 09:42:24: Some of the components patched on node
INFO: 2018-09-25 09:42:24: require node reboot. Rebooting the node
INFO: 2018-09-25 09:42:24: rebooting xx2 via /tmp/dom0reboot...
..........
INFO: 2018-09-25 09:48:03: xx2 is rebooting...
INFO: 2018-09-25 09:48:03: Waiting for xx2 to reboot...
........
INFO: 2018-09-25 09:55:24: xx2 has rebooted...
INFO: 2018-09-25 09:55:24: Waiting for processes on xx2 to start...
..
INFO: Patching server component on node: xx1
INFO: 2018-09-25 09:59:31: Patching ODABASE Server Components (including Grid software)
INFO: 2018-09-25 09:59:31: ------------------Patching HMP-------------------------
SUCCESS: 2018-09-25 10:00:26: Successfully upgraded the HMP
INFO: 2018-09-25 10:00:26: creating /usr/lib64/sun-ssm symlink
INFO: 2018-09-25 10:00:27: ----------------------Patching OAK---------------------
SUCCESS: 2018-09-25 10:00:59: Successfully upgraded OAK
INFO: 2018-09-25 10:01:02: ----------------------Patching JDK---------------------
SUCCESS: 2018-09-25 10:01:12: Successfully upgraded JDK
INFO: 2018-09-25 10:01:12: ----------------------Patching IPMI---------------------
INFO: 2018-09-25 10:01:12: IPMI is already upgraded or running with the latest version
INFO: 2018-09-25 10:01:13: ------------------Patching OS-------------------------
INFO: 2018-09-25 10:01:36: Removed kernel-uek-firmware-4.1.12-61.44.1.el6uek.noarch
INFO: 2018-09-25 10:01:52: Removed kernel-uek-4.1.12-61.44.1.el6uek.x86_64
INFO: 2018-09-25 10:02:03: Clusterware is running on local node
INFO: 2018-09-25 10:02:03: Attempting to stop clusterware and its resources locally
SUCCESS: 2018-09-25 10:03:22: Successfully stopped the clusterware on local node
SUCCESS: 2018-09-25 10:07:36: Successfully upgraded the OS
INFO: 2018-09-25 10:07:40: ------------------Patching Grid-------------------------
INFO: 2018-09-25 10:07:45: Checking for available free space on /, /tmp, /u01
INFO: 2018-09-25 10:07:50: Attempting to upgrade grid.
INFO: 2018-09-25 10:07:50: Executing /opt/oracle/oak/pkgrepos/System/12.2.1.4.0/bin/GridUpgrade.pl...
SUCCESS: 2018-09-25 10:55:07: Grid software has been updated.
INFO: 2018-09-25 10:55:07: Patching DOM0 Server Components
INFO: 2018-09-25 10:55:07: Attempting to patch OS on Dom0...
INFO: 2018-09-25 10:55:16: Clusterware is running on local node
INFO: 2018-09-25 10:55:16: Attempting to stop clusterware and its resources locally
SUCCESS: 2018-09-25 10:56:45: Successfully stopped the clusterware on local node
SUCCESS: 2018-09-25 11:02:19: Successfully updated the device OVM to 3.4.4
INFO: 2018-09-25 11:02:19: Attempting to patch the HMP on Dom0...
SUCCESS: 2018-09-25 11:02:26: Successfully updated the device HMP to the version 2.4.1.0.11 on Dom0
INFO: 2018-09-25 11:02:26: Attempting to patch the IPMI on Dom0...
INFO: 2018-09-25 11:02:27: Successfully updated the IPMI on Dom0
INFO: 2018-09-25 11:02:30: Attempting to patch the local storage on Dom0...
INFO: 2018-09-25 11:02:30: Stopping clusterware on local node...
INFO: 2018-09-25 11:02:37: Disk : c0d0 is already running with MS4SC2JH2ORA480G 0R3Q
INFO: 2018-09-25 11:02:38: Disk : c0d1 is already running with MS4SC2JH2ORA480G 0R3Q
INFO: 2018-09-25 11:02:40: Controller : c0 is already running with 0x005d 4.650.00-7176
INFO: 2018-09-25 11:02:41: Attempting to patch the ILOM on Dom0...
SUCCESS: 2018-09-25 11:27:49: Successfully updated the device Ilom to 4.0.2.26.a r123797
SUCCESS: 2018-09-25 11:27:49: Successfully updated the device BIOS to 38100300
INFO: Infrastructure patching summary on node: xxxx1
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the HMP
SUCCESS: 2018-09-25 11:27:54: Successfully updated the OAK
SUCCESS: 2018-09-25 11:27:54: Successfully updated the JDK
INFO: 2018-09-25 11:27:54: IPMI is already upgraded
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the OS
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded GI
SUCCESS: 2018-09-25 11:27:54: Successfully updated the device OVM
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the HMP on Dom0
INFO: 2018-09-25 11:27:54: Local storage patching summary on Dom0...
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the local storage
SUCCESS: 2018-09-25 11:27:54: Successfully updated the device Ilom
SUCCESS: 2018-09-25 11:27:54: Successfully updated the device BIOS
INFO: Infrastructure patching summary on node: xxxx2
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the HMP
SUCCESS: 2018-09-25 11:27:54: Successfully updated the OAK
SUCCESS: 2018-09-25 11:27:54: Successfully updated the JDK
INFO: 2018-09-25 11:27:54: IPMI is already upgraded
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the OS
SUCCESS: 2018-09-25 11:27:54: Successfully updated the device OVM
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the HMP on Dom0
INFO: 2018-09-25 11:27:54: Local storage patching summary on Dom0...
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the local storage
SUCCESS: 2018-09-25 11:27:54: Successfully updated the device Ilom
SUCCESS: 2018-09-25 11:27:54: Successfully updated the device BIOS
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded GI
INFO: Running post-install scripts
INFO: Running postpatch on node 1...
INFO: Running postpatch on node 0...
...
...
INFO: Started Oakd
INFO: 2018-09-25 11:32:26: Some of the components patched on node
INFO: 2018-09-25 11:32:26: require node reboot. Rebooting the node
INFO: Rebooting Dom0 on node 0
INFO: 2018-09-25 11:32:26: Running /tmp/dom0reboot on node 0
INFO: 2018-09-25 11:33:10: Clusterware is running on local node
INFO: 2018-09-25 11:33:10: Attempting to stop clusterware and its resources locally
SUCCESS: 2018-09-25 11:35:52: Successfully stopped the clusterware on local node
INFO: 2018-09-25 11:38:54: RPC::XML::Client::send_request: HTTP server error: read timeout
[root@xx1 ~]#
Broadcast message from root@xx1
(unknown) at 11:39 ...
The system is going down for power off NOW!

[root@xx1 ~]# oakcli update -patch 12.2.1.4.0 --storage
INFO: DB, ASM, Clusterware may be stopped during the patch if required
INFO: Both Nodes may get rebooted automatically during the patch if required
Do you want to continue: [Y/N]?: Y
INFO: User has confirmed for the reboot
INFO: Running pre-install scripts
INFO: Running prepatching on node 0
INFO: Running prepatching on node 1
INFO: Completed pre-install scripts
INFO: Shared Storage components need to be patched
INFO: Stopping OAKD on both nodes...
INFO: Stopped Oakd
INFO: Attempting to shutdown clusterware (if required)..
INFO: 2018-09-25 12:07:13: Clusterware is running on one or more nodes of the cluster
INFO: 2018-09-25 12:07:13: Attempting to stop clusterware and its resources across the cluster
SUCCESS: 2018-09-25 12:07:59: Successfully stopped the clusterware
INFO: Patching storage on node xx2
INFO: Patching storage on node xx1
INFO: 2018-09-25 12:08:23: ----------------Patching Storage-------------------
INFO: 2018-09-25 12:08:23: ....................Patching Shared SSDs...............
INFO: 2018-09-25 12:08:23: Disk : d0 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:23: Disk : d1 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:23: Disk : d2 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:24: Disk : d3 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:24: Disk : d4 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:24: Disk : d5 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:25: Disk : d6 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:25: Disk : d7 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:25: Disk : d8 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:26: Disk : d9 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:26: Disk : d10 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:26: Disk : d11 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:27: Disk : d12 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:27: Disk : d13 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:27: Disk : d14 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:28: Disk : d15 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:28: Disk : d16 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:28: Disk : d17 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:29: Disk : d18 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:29: Disk : d19 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:30: Disk : d20 is already running with : HSCAC2DA6SUN200G A29A
INFO: 2018-09-25 12:08:30: Disk : d21 is already running with : HSCAC2DA6SUN200G A29A
INFO: 2018-09-25 12:08:30: Disk : d22 is already running with : HSCAC2DA6SUN200G A29A
INFO: 2018-09-25 12:08:31: Disk : d23 is already running with : HSCAC2DA6SUN200G A29A
INFO: 2018-09-25 12:08:31: ....................Patching Shared HDDs...............
INFO: 2018-09-25 12:08:31: ....................Patching Expanders...............
INFO: 2018-09-25 12:08:31: Updating the Expander : c0x0 with the Firmware : DE3-24C 0306
SUCCESS: 2018-09-25 12:09:24: Successfully updated the Firmware on Expander : c0x0 to DE3-24C 0306
INFO: 2018-09-25 12:09:24: Updating the Expander : c1x0 with the Firmware : DE3-24C 0306
SUCCESS: 2018-09-25 12:10:16: Successfully updated the Firmware on Expander : c1x0 to DE3-24C 0306
INFO: 2018-09-25 12:10:16: ..............Patching Shared Controllers...............
INFO: 2018-09-25 12:10:16: Controller : c0 is already running with : 0x0097 13.00.00.00
INFO: 2018-09-25 12:10:17: Controller : c1 is already running with : 0x0097 13.00.00.00
INFO: 2018-09-25 12:10:17: ------------ Completed Storage Patching------------
INFO: 2018-09-25 12:10:17: Completed patching of shared_storage
INFO: Patching completed for component Storage
INFO: Running post-install scripts
INFO: Running postpatch on node 1...
INFO: Running postpatch on node 0...
INFO: 2018-09-25 12:10:28: Some of the components patched on node
INFO: 2018-09-25 12:10:28: require node reboot. Rebooting the node
INFO: 2018-09-25 12:10:28: Running /tmp/pending_actions on node 1
INFO: Node will reboot now.
INFO: Please check reboot progress via ILOM interface
INFO: This session may appear to hang, press ENTER after reboot
INFO: 2018-09-25 12:12:53: Rebooting Dom1 on node 0
INFO: Running /tmp/pending_actions on node 0
Broadcast message from oracle@xx1
(/dev/pts/0) at 12:13 ...
The system is going down for reboot NOW!

After successful patching:

Install and configure compatible versions of all previously removed packages
Mount all ACFS filesystems
Recreate all deleted Grid Infrastructure resources and start them
Re-enable all jobs that were disabled before patching
Resume all ACFS replications
Set the state of all Data Guard standby databases to APPLY-ON
Check ACFS replications
Check Data Guard status
Check whether everything works as before

3 Comments

  • Mike McM says:

    In your applying patch checklist, you have:
    Remove all resources from Grid Infrastructure which depend on ACFS filesystems (srvctl remove)
    Have you experienced previous patch failures because of this issue and is this now part of your standard patch checklist?
    I only have REDO, RECO, DATA datastores for acfs volumes but the dependency list seems quite huge. Do you script this or do it manually for removal? I guess you shutdown databases and grid beforehand? Can you give an example of which srvctl remove commands and add commands would look like?
    I seem to be at .500 success rate with ODA patching and would like to improve on this.
    Thanks for your help. Lots of great posts on your blog.

    Mike

  • Mike McM says:

    Do you ever patch with --local and do one node at a time? I try to keep the databases accessible on one node, but I’m wondering whether this is the cause of the patch bundle failures and whether it is safer to take a full outage.

    Thanks

    • Michael Hein says:

      Hello Mike,

      If you use ACFS with the registry, you cannot keep databases open during the upgrade, because this means that they are still in Grid Infrastructure, which causes patching to fail. The option “--local” did not work in my attempts.

      Hope this helps,

      Michael
