Marc Wagner

ODA appliance creation error and cleanup.pl with option force

September 25, 2020 | Oracle

I recently faced an interesting issue while deploying an ODA X8-2-HA. I would like to share my experience here, in the hope that it helps some of you.

Error in creating the appliance

After reimaging both node0 and node1 of the ODA X8-2-HA, running configure-firstnet and updating the repository, I was ready to create the appliance.
On an ODA X8-2-HA, the appliance creation is run from node0. The provisioning service creation job ended in failure:
[root@node0_name ODA_patch]# odacli describe-job -i 1d6d3c88-a729-4a56-813c-f6751150441c
 
Job details
----------------------------------------------------------------
ID: 1d6d3c88-a729-4a56-813c-f6751150441c
Description: Provisioning service creation
Status: Failure
Created: September 7, 2020 3:12:16 PM CEST
Message: DCS-10001:Internal error encountered: Fail to run root scripts : Check /u01/app/19.0.0.0/grid/install/root_node0_name_2020-09-07_15-41-59-761776313.log for the output of root script.
 
Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Provisioning service creation            September 7, 2020 3:12:24 PM CEST   September 7, 2020 3:45:21 PM CEST   Failure
Provisioning service creation            September 7, 2020 3:12:24 PM CEST   September 7, 2020 3:45:21 PM CEST   Failure
network update                           September 7, 2020 3:12:26 PM CEST   September 7, 2020 3:12:40 PM CEST   Success
updating network                         September 7, 2020 3:12:26 PM CEST   September 7, 2020 3:12:40 PM CEST   Success
Setting up Network                       September 7, 2020 3:12:26 PM CEST   September 7, 2020 3:12:26 PM CEST   Success
network update                           September 7, 2020 3:12:40 PM CEST   September 7, 2020 3:12:51 PM CEST   Success
updating network                         September 7, 2020 3:12:41 PM CEST   September 7, 2020 3:12:51 PM CEST   Success
Setting up Network                       September 7, 2020 3:12:41 PM CEST   September 7, 2020 3:12:41 PM CEST   Success
OS usergroup 'asmdba'creation            September 7, 2020 3:12:51 PM CEST   September 7, 2020 3:12:51 PM CEST   Success
OS usergroup 'asmoper'creation           September 7, 2020 3:12:51 PM CEST   September 7, 2020 3:12:51 PM CEST   Success
OS usergroup 'asmadmin'creation          September 7, 2020 3:12:51 PM CEST   September 7, 2020 3:12:51 PM CEST   Success
OS usergroup 'dba'creation               September 7, 2020 3:12:51 PM CEST   September 7, 2020 3:12:51 PM CEST   Success
OS usergroup 'dbaoper'creation           September 7, 2020 3:12:51 PM CEST   September 7, 2020 3:12:51 PM CEST   Success
OS usergroup 'oinstall'creation          September 7, 2020 3:12:51 PM CEST   September 7, 2020 3:12:51 PM CEST   Success
OS user 'grid'creation                   September 7, 2020 3:12:51 PM CEST   September 7, 2020 3:12:51 PM CEST   Success
OS user 'oracle'creation                 September 7, 2020 3:12:51 PM CEST   September 7, 2020 3:12:52 PM CEST   Success
Default backup policy creation           September 7, 2020 3:12:52 PM CEST   September 7, 2020 3:12:52 PM CEST   Success
Backup config metadata persist           September 7, 2020 3:12:52 PM CEST   September 7, 2020 3:12:52 PM CEST   Success
SSH equivalance setup                    September 7, 2020 3:12:52 PM CEST   September 7, 2020 3:12:52 PM CEST   Success
Grid home creation                       September 7, 2020 3:13:08 PM CEST   September 7, 2020 3:29:00 PM CEST   Success
Creating GI home directories             September 7, 2020 3:13:08 PM CEST   September 7, 2020 3:13:08 PM CEST   Success
Cloning Gi home                          September 7, 2020 3:13:08 PM CEST   September 7, 2020 3:15:13 PM CEST   Success
Cloning Gi home                          September 7, 2020 3:15:13 PM CEST   September 7, 2020 3:28:57 PM CEST   Success
Updating GiHome version                  September 7, 2020 3:28:57 PM CEST   September 7, 2020 3:29:00 PM CEST   Success
Updating GiHome version                  September 7, 2020 3:28:57 PM CEST   September 7, 2020 3:29:00 PM CEST   Success
Storage discovery                        September 7, 2020 3:29:00 PM CEST   September 7, 2020 3:40:26 PM CEST   Success
Grid stack creation                      September 7, 2020 3:40:26 PM CEST   September 7, 2020 3:45:21 PM CEST   Failure
Configuring GI                           September 7, 2020 3:40:26 PM CEST   September 7, 2020 3:41:59 PM CEST   Success
Running GI root scripts                  September 7, 2020 3:41:59 PM CEST   September 7, 2020 3:45:21 PM CEST   Failure
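
With a job report this long, the failing steps are easy to miss. The following is only a minimal sketch, assuming the report text has been saved or captured as a string: it picks out the task rows by their two timestamps and lists the tasks whose status is Failure. The names TIMESTAMP, parse_task_line, failed_tasks and SAMPLE are hypothetical helpers and are not part of odacli.

```python
import re

# Timestamp format as printed in the job report, e.g.
# "September 7, 2020 3:12:24 PM CEST".
TIMESTAMP = re.compile(
    r"[A-Z][a-z]+ \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M [A-Z]+"
)


def parse_task_line(line):
    """Split one task row into (name, start, end, status), or return None."""
    stamps = list(TIMESTAMP.finditer(line))
    if len(stamps) != 2:
        return None  # header, separator or job-detail line
    name = line[: stamps[0].start()].strip()
    status = line[stamps[1].end():].strip()
    return name, stamps[0].group(), stamps[1].group(), status


def failed_tasks(report):
    """Return the names of all tasks whose status is 'Failure', in order."""
    failed = []
    for line in report.splitlines():
        row = parse_task_line(line)
        if row and row[3] == "Failure":
            failed.append(row[0])
    return failed


# A few rows copied from the report above.
SAMPLE = """\
Storage discovery September 7, 2020 3:29:00 PM CEST September 7, 2020 3:40:26 PM CEST Success
Grid stack creation September 7, 2020 3:40:26 PM CEST September 7, 2020 3:45:21 PM CEST Failure
Configuring GI September 7, 2020 3:40:26 PM CEST September 7, 2020 3:41:59 PM CEST Success
Running GI root scripts September 7, 2020 3:41:59 PM CEST September 7, 2020 3:45:21 PM CEST Failure
"""

if __name__ == "__main__":
    print(failed_tasks(SAMPLE))
```

On the rows above this reports "Grid stack creation" and "Running GI root scripts", which points straight at the root script log named in the job message.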

Looking through the log, I found errors during the creation of the +DATA disk group:
[root@node0_name ODA_patch]# more /u01/app/19.0.0.0/grid/install/root_node0_name_2020-09-07_15-41-59-761776313.log
...
...
...
SQL*Plus: Release 19.0.0.0.0 - Production on Mon Sep 7 15:45:00 2020
Version 19.8.0.0.0
 
Copyright (c) 1982, 2020, Oracle. All rights reserved.
 
Connected to an idle instance.
 
ASM instance started
 
Total System Global Area 1137173320 bytes
Fixed Size 8905544 bytes
Variable Size 1103101952 bytes
ASM Cache 25165824 bytes
ORA-15032: not all alterations performed
ORA-15036: Disk 'AFD:HDD_E0_S03_1732452488P1' is truncated to 4671744 MB from
7941632 MB.
ORA-15036: Disk 'AFD:HDD_E0_S02_1732457496P1' is truncated to 4671744 MB from
7941632 MB.
ORA-15036: Disk 'AFD:HDD_E0_S01_1732376280P1' is truncated to 4671744 MB from
7941632 MB.
ORA-15036: Disk 'AFD:HDD_E0_S00_1733177556P1' is truncated to 4671744 MB from
7941632 MB.
 
 
create diskgroup DATA NORMAL REDUNDANCY
*
ERROR at line 1:
ORA-15018: diskgroup cannot be created
ORA-15033: disk 'AFD:HDD_E0_S03_1732452488P1' belongs to diskgroup "DATA"
ORA-15033: disk 'AFD:HDD_E0_S02_1732457496P1' belongs to diskgroup "DATA"
ORA-15033: disk 'AFD:HDD_E0_S01_1732376280P1' belongs to diskgroup "DATA"
ORA-15033: disk 'AFD:HDD_E0_S00_1733177556P1' belongs to diskgroup "DATA"
 
 
create spfile='+DATA' from pfile='/u01/app/19.0.0.0/grid/dbs/init_+ASM1.ora'
*
ERROR at line 1:
ORA-17635: failure in obtaining physical sector size for '+DATA'

I realized that this was because the ODA had been used previously and the storage shelf disks had not been cleared.
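
The ORA-15033 lines are the telling ones: each names a disk that still carries the header of an old disk group. As an illustration only, the sketch below groups those disks by disk group from the text of a root script log. STALE_DISK, stale_disks and LOG_SAMPLE are hypothetical names, and the pattern is taken from the message format shown in the log above.

```python
import re
from collections import defaultdict

# Matches lines such as:
# ORA-15033: disk 'AFD:HDD_E0_S03_1732452488P1' belongs to diskgroup "DATA"
STALE_DISK = re.compile(
    r"ORA-15033: disk '(?P<disk>[^']+)' "
    r'belongs to diskgroup "(?P<group>[^"]+)"'
)


def stale_disks(log_text):
    """Map each disk group name to the disks that still carry its header."""
    found = defaultdict(list)
    for match in STALE_DISK.finditer(log_text):
        found[match.group("group")].append(match.group("disk"))
    return dict(found)


# Two lines copied from the log above, plus one unrelated error.
LOG_SAMPLE = """\
ORA-15018: diskgroup cannot be created
ORA-15033: disk 'AFD:HDD_E0_S03_1732452488P1' belongs to diskgroup "DATA"
ORA-15033: disk 'AFD:HDD_E0_S02_1732457496P1' belongs to diskgroup "DATA"
"""

if __name__ == "__main__":
    for group, disks in stale_disks(LOG_SAMPLE).items():
        print(group, len(disks), "disk(s) with an old header")
```

Any disk listed here was labelled by a previous deployment, which is a strong hint that the storage needs to be erased before provisioning again.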

Performing Secure Erase of Data on Storage Disks

I then decided to erase the data on the storage disks.

[root@node0_name ~]# ps -ef | grep pmon
grid 5652 1 0 15:45 ? 00:00:00 asm_pmon_+ASM1
root 57503 56032 0 15:59 pts/1 00:00:00 grep --color=auto pmon
 
[root@node0_name ~]# odaadmcli stop oak
2020-09-07 16:02:38.815975760:[init.oak]:[Error : Operation not permitted while software upgrade is in progress ...]  
[root@node0_name ~]# /u01/app/19.0.0.0/grid/bin/crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node0_name'
CRS-2673: Attempting to stop 'ora.asm' on 'node0_name'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'node0_name'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'node0_name'
CRS-2677: Stop of 'ora.drivers.acfs' on 'node0_name' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'node0_name' succeeded
CRS-2677: Stop of 'ora.asm' on 'node0_name' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'node0_name'
CRS-2673: Attempting to stop 'ora.evmd' on 'node0_name'
CRS-2677: Stop of 'ora.ctssd' on 'node0_name' succeeded
CRS-2677: Stop of 'ora.evmd' on 'node0_name' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'node0_name'
CRS-2677: Stop of 'ora.cssd' on 'node0_name' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'node0_name'
CRS-2673: Attempting to stop 'ora.gipcd' on 'node0_name'
CRS-2673: Attempting to stop 'ora.driver.afd' on 'node0_name'
CRS-2677: Stop of 'ora.driver.afd' on 'node0_name' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'node0_name' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'node0_name' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'node0_name' has completed
CRS-4133: Oracle High Availability Services has been stopped.
 
[root@node0_name app]# /opt/oracle/oak/bin/odaeraser.py
Please stop oakd and GI/DB applications before running this program
[root@node0_name app]#

So it is clear: because the system considers the provisioning to be still in progress, we cannot stop oak, and therefore cannot run the odaeraser Python script.
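
Before attempting an erase, it helps to know which instances are still up. The sketch below only illustrates the ps -ef | grep pmon check used above: it scans ps -ef style output for pmon background processes and ignores the grep command itself. It is not the check that odaeraser.py performs (that logic is not shown in this post), and running_pmon_processes and PS_SAMPLE are hypothetical names.

```python
def running_pmon_processes(ps_output):
    """Return the names of pmon background processes found in `ps -ef` output."""
    names = []
    for line in ps_output.splitlines():
        # ps -ef columns: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # blank or malformed line
        program = fields[7].split()[0]
        # Instance processes look like asm_pmon_+ASM1 or ora_pmon_<SID>;
        # "grep --color=auto pmon" is skipped because its program is "grep".
        if "_pmon_" in program:
            names.append(program)
    return names


# Output copied from the session above.
PS_SAMPLE = """\
grid 5652 1 0 15:45 ? 00:00:00 asm_pmon_+ASM1
root 57503 56032 0 15:59 pts/1 00:00:00 grep --color=auto pmon
"""

if __name__ == "__main__":
    print(running_pmon_processes(PS_SAMPLE))  # ['asm_pmon_+ASM1']
```

An empty list only means that no ASM or database instance is running. As the session above shows, oakd can still block the eraser even after CRS has been stopped.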

Execute cleanup script

There was no other choice than to run the ODA cleanup script. Erasing data is also an option of this script, so all good. I ran it on both nodes.

[root@node0_name app]# perl /opt/oracle/oak/onecmd/cleanup.pl -griduser grid -dbuser oracle -erasedata
INFO: *******************************************************************
INFO: ** Starting process to cleanup provisioned host node0_name **
INFO: *******************************************************************
WARNING: Secure Erase is an irrecoverable process. All data on the disk
WARNING: will be erased, and cannot be recovered by any means. On X3-2,
WARNING: X4-2, and X5-2 HA, the secure erase process can take more than
WARNING: 10 hours. If you need this data, then take a complete backup
WARNING: before proceeding.
Do you want to continue (yes/no) : yes
INFO: nodes will be rebooted
Do you want to continue (yes/no) : yes
INFO: /u01/app/19.0.0.0/grid/.patch_storage/31548513_Jul_10_2020_12_12_45/files/bin/crsctl.bin
...
...
...
--------------------------------------------------------------------------------
Label Filtering Path
================================================================================
HDD_E0_S00_1733177556P1 ENABLED /dev/mapper/HDD_E0_S00_1733177556p1
HDD_E0_S00_1733177556P2 ENABLED /dev/mapper/HDD_E0_S00_1733177556p2
HDD_E0_S01_1732376280P1 ENABLED /dev/mapper/HDD_E0_S01_1732376280p1
HDD_E0_S01_1732376280P2 ENABLED /dev/mapper/HDD_E0_S01_1732376280p2
HDD_E0_S02_1732457496P1 ENABLED /dev/mapper/HDD_E0_S02_1732457496p1
HDD_E0_S02_1732457496P2 ENABLED /dev/mapper/HDD_E0_S02_1732457496p2
HDD_E0_S03_1732452488P1 ENABLED /dev/mapper/HDD_E0_S03_1732452488p1
HDD_E0_S03_1732452488P2 ENABLED /dev/mapper/HDD_E0_S03_1732452488p2
HDD_E0_S04_1733016128P1 ENABLED /dev/mapper/HDD_E0_S04_1733016128p1
HDD_E0_S04_1733016128P2 ENABLED /dev/mapper/HDD_E0_S04_1733016128p2
HDD_E0_S05_1732360364P1 ENABLED /dev/mapper/HDD_E0_S05_1732360364p1
HDD_E0_S05_1732360364P2 ENABLED /dev/mapper/HDD_E0_S05_1732360364p2
HDD_E0_S06_1732360644P1 ENABLED /dev/mapper/HDD_E0_S06_1732360644p1
HDD_E0_S06_1732360644P2 ENABLED /dev/mapper/HDD_E0_S06_1732360644p2
HDD_E0_S07_1732123852P1 ENABLED /dev/mapper/HDD_E0_S07_1732123852p1
HDD_E0_S07_1732123852P2 ENABLED /dev/mapper/HDD_E0_S07_1732123852p2
HDD_E0_S08_1733366676P1 ENABLED /dev/mapper/HDD_E0_S08_1733366676p1
HDD_E0_S08_1733366676P2 ENABLED /dev/mapper/HDD_E0_S08_1733366676p2
HDD_E0_S09_1733005616P1 ENABLED /dev/mapper/HDD_E0_S09_1733005616p1
HDD_E0_S09_1733005616P2 ENABLED /dev/mapper/HDD_E0_S09_1733005616p2
HDD_E0_S10_1733304324P1 ENABLED /dev/mapper/HDD_E0_S10_1733304324p1
HDD_E0_S10_1733304324P2 ENABLED /dev/mapper/HDD_E0_S10_1733304324p2
HDD_E0_S11_1733286472P1 ENABLED /dev/mapper/HDD_E0_S11_1733286472p1
HDD_E0_S11_1733286472P2 ENABLED /dev/mapper/HDD_E0_S11_1733286472p2
HDD_E0_S12_1732005336P1 ENABLED /dev/mapper/HDD_E0_S12_1732005336p1
HDD_E0_S12_1732005336P2 ENABLED /dev/mapper/HDD_E0_S12_1732005336p2
HDD_E0_S13_1733060756P1 ENABLED /dev/mapper/HDD_E0_S13_1733060756p1
HDD_E0_S13_1733060756P2 ENABLED /dev/mapper/HDD_E0_S13_1733060756p2
HDD_E0_S14_1733060744P1 ENABLED /dev/mapper/HDD_E0_S14_1733060744p1
HDD_E0_S14_1733060744P2 ENABLED /dev/mapper/HDD_E0_S14_1733060744p2
SSD_E0_S15_2701225596P1 ENABLED /dev/mapper/SSD_E0_S15_2701225596p1
SSD_E0_S16_2701244384P1 ENABLED /dev/mapper/SSD_E0_S16_2701244384p1
SSD_E0_S17_2701254344P1 ENABLED /dev/mapper/SSD_E0_S17_2701254344p1
SSD_E0_S18_2701254500P1 ENABLED /dev/mapper/SSD_E0_S18_2701254500p1
SSD_E0_S19_2701253460P1 ENABLED /dev/mapper/SSD_E0_S19_2701253460p1
SSD_E0_S20_2617542596P1 ENABLED /dev/mapper/SSD_E0_S20_2617542596p1
SSD_E0_S21_2617528156P1 ENABLED /dev/mapper/SSD_E0_S21_2617528156p1
SSD_E0_S22_2617509504P1 ENABLED /dev/mapper/SSD_E0_S22_2617509504p1
SSD_E0_S23_2617546936P1 ENABLED /dev/mapper/SSD_E0_S23_2617546936p1
...
...
...
INFO: Executing
Start erasing disks on the system
On some platforms, this will take several hours to finish, please wait
Do you want to continue (yes|no) [yes] ?
Number of disks are processing: 0
 
Disk Vendor Model Erase method Status Time(seconds)
e0_pd_00 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_01 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_02 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_03 HGST H7210A520SUN010[18173.972533] reboot: Restarting system
...
...
...

The erase process seems to have run successfully. This can also be checked with the -checkHeader option:
[root@node0_name ~]# perl /opt/oracle/oak/onecmd/cleanup.pl -checkHeader
DCS config is - /opt/oracle/dcs/conf/dcs-agent.jsonINFO: Emulation mode is set to FALSE
Command may take few minutes...
OS Disk Disk Type OAK Header ASM Header(p1) ASM Header(p2) Error: /dev/sdaa: unrecognised disk label
 
/dev/sdaa HDD Erased Erased UnKnownError: /dev/sdab: unrecognised disk label
 
/dev/sdab HDD Erased Erased UnKnownError: /dev/sdac: unrecognised disk label
 
/dev/sdac HDD Erased Erased UnKnownError: /dev/sdad: unrecognised disk label
 
/dev/sdad HDD Erased Erased UnKnownError: /dev/sdae: unrecognised disk label
 
/dev/sdae HDD Erased Erased UnKnownError: /dev/sdaf: unrecognised disk label
 
/dev/sdaf HDD Erased Erased UnKnownError: /dev/sdag: unrecognised disk label
 
...
...
...
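
Because error messages are interleaved with the header report, it is hard to read by eye. The sketch below, a rough illustration only, extracts the per-disk rows from the output as it appears above and lists the disks whose OAK header or first ASM header is not reported as Erased. Ignoring the UnKnown value in the ASM Header(p2) column is an assumption based on the erase having been considered successful here. ROW, parse_check_header, not_erased and HEADER_SAMPLE are hypothetical names.

```python
import re

# One disk row: device, disk type, OAK header, ASM header (p1), ASM header (p2).
# A trailing "Error: ..." message glued to the last column is dropped.
ROW = re.compile(
    r"^(?P<dev>/dev/\S+)\s+(?P<type>HDD|SSD)\s+(?P<oak>\S+)\s+"
    r"(?P<p1>\S+)\s+(?P<p2>\S+?)(?:Error:.*)?$"
)


def parse_check_header(output):
    """Return one dict per disk row found in the -checkHeader output."""
    rows = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


def not_erased(rows, columns=("oak", "p1")):
    """List devices for which any of the given header columns is not 'Erased'."""
    return [r["dev"] for r in rows if any(r[c] != "Erased" for c in columns)]


# Lines copied from the output above.
HEADER_SAMPLE = """\
OS Disk Disk Type OAK Header ASM Header(p1) ASM Header(p2) Error: /dev/sdaa: unrecognised disk label

/dev/sdaa HDD Erased Erased UnKnownError: /dev/sdab: unrecognised disk label

/dev/sdab HDD Erased Erased UnKnownError: /dev/sdac: unrecognised disk label
"""

if __name__ == "__main__":
    disks = parse_check_header(HEADER_SAMPLE)
    print(len(disks), "disk rows,", len(not_erased(disks)), "not erased")
```

Passing columns=("oak", "p1", "p2") makes the check stricter, in which case the disks above would be flagged because of the UnKnown value.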

The odaeraser Python script can now be executed as well. It adds no value other than a check, as the disks have already been cleaned up:
[root@node0_name ~]# /opt/oracle/oak/bin/odaeraser.py
Start erasing disks on the system
On some platforms, this will take several hours to finish, please wait
Do you want to continue (yes|no) [yes] ? yes
Number of disks are processing: 0
 
Disk Vendor Model Erase method Status Time(seconds)
e0_pd_00 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_01 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_02 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_03 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_04 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_05 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_06 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_07 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_08 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_09 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_10 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_11 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_12 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_13 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_14 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_15 HGST HBCAC2DH2SUN3.2T SCSI crypto erase success 0
e0_pd_16 HGST HBCAC2DH2SUN3.2T SCSI crypto erase success 0
e0_pd_17 HGST HBCAC2DH2SUN3.2T SCSI crypto erase success 0
e0_pd_18 HGST HBCAC2DH2SUN3.2T SCSI crypto erase success 0
e0_pd_19 HGST HBCAC2DH2SUN3.2T SCSI crypto erase success 0
e0_pd_20 HGST HBCAC2DH4SUN800G SCSI crypto erase success 0
e0_pd_21 HGST HBCAC2DH4SUN800G SCSI crypto erase success 0
e0_pd_22 HGST HBCAC2DH4SUN800G SCSI crypto erase success 0
e0_pd_23 HGST HBCAC2DH4SUN800G SCSI crypto erase success 0
[root@node0_name ~]#
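
With 24 disks in the report, a quick way to confirm that every one of them ended in success is to parse the table. This is a minimal sketch under the assumption that disk names follow the e0_pd_NN pattern seen above. ERASE_ROW, parse_erase_report, failed_erases and ERASE_SAMPLE are hypothetical names and not part of the ODA tooling.

```python
import re

# One report row: disk, vendor, model, erase method, status, time in seconds.
ERASE_ROW = re.compile(
    r"^(?P<disk>e\d+_pd_\d+)\s+(?P<vendor>\S+)\s+(?P<model>\S+)\s+"
    r"(?P<method>.+?)\s+(?P<status>\S+)\s+(?P<seconds>\d+)$"
)


def parse_erase_report(report):
    """Return one dict per complete disk row in the eraser report."""
    rows = []
    for line in report.splitlines():
        match = ERASE_ROW.match(line.strip())
        if match:
            row = match.groupdict()
            row["seconds"] = int(row["seconds"])
            rows.append(row)
    return rows


def failed_erases(rows):
    """List the disks whose status is anything other than 'success'."""
    return [r["disk"] for r in rows if r["status"] != "success"]


# Two complete rows, plus the row that was cut off by the reboot earlier.
ERASE_SAMPLE = """\
Number of disks are processing: 0
Disk Vendor Model Erase method Status Time(seconds)
e0_pd_00 HGST H7210A520SUN010T SCSI crypto erase success 0
e0_pd_15 HGST HBCAC2DH2SUN3.2T SCSI crypto erase success 0
e0_pd_03 HGST H7210A520SUN010[18173.972533] reboot: Restarting system
"""

if __name__ == "__main__":
    parsed = parse_erase_report(ERASE_SAMPLE)
    print(len(parsed), "rows parsed,", len(failed_erases(parsed)), "failed")
```

Truncated rows, such as the one interrupted by the reboot, are skipped rather than guessed at, so the number of parsed rows should be compared with the number of disks expected in the shelf.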

Force option of the perl cleanup script

I then tried to create the appliance again, and was surprised that it was still not possible, even after running the cleanup.pl script. The same message was displayed in the web GUI: “There was an error in configuring Oracle Database Appliance….For more information about running the cleanup script, see the Deployment and User’s Guide for your model.” See the picture below:





The only solution was to use the force option of the cleanup.pl script. Running the following command solved my problem:
[root@node0_name ~]# perl /opt/oracle/oak/onecmd/cleanup.pl -f

After that, everything was OK, including in the GUI, as shown in the picture below:

