Infrastructure at your Service

Marc Wagner

ODA hang/crash due to software raid-check

Oracle Database Appliance (ODA) is by default configured with software raid for Operating System and Oracle Database software file system (2 internal SSD disks). 2 raid devices are configured : md0 and md1.ODA are configured to run raid-check every Sunday at 1am.

Analysing the problem

In case the ODA is having some load during raid-check, it can happen that the server freezes. Only IP layer seems to still be alive : server is replying to the ping command, but ssh layer is not available any more.
Nothing can be done with the ODA : no ssh connection, all logs and writes on the server are stuck, ILOM serial connection is impossible.

The only solution is to power cycle the ODA through ILOM.

Problem could be reproduced on customer side by running 2 RMAN database backups and manually executing the raid-check.

In /var/log/messages we can see that server hung doing raid-check on md1 :

Oct 27 01:00:01 ODA02 kernel: [6245829.462343] md: data-check of RAID array md0
Oct 27 01:00:01 ODA02 kernel: [6245829.462347] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Oct 27 01:00:01 ODA02 kernel: [6245829.462349] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
Oct 27 01:00:01 ODA02 kernel: [6245829.462364] md: using 128k window, over a total of 511936k.
Oct 27 01:00:04 ODA02 kernel: [6245832.154108] md: md0: data-check done.
Oct 27 01:01:02 ODA02 kernel: [6245890.375430] md: data-check of RAID array md1
Oct 27 01:01:02 ODA02 kernel: [6245890.375433] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Oct 27 01:01:02 ODA02 kernel: [6245890.375435] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
Oct 27 01:01:02 ODA02 kernel: [6245890.375452] md: using 128k window, over a total of 467694592k.
Oct 27 04:48:07 ODA02 kernel: imklog 5.8.10, log source = /proc/kmsg started. ==> Restart of ODA with ILOM, server freezed on data-check of RAID array md1
Oct 27 04:48:07 ODA02 rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="5788" x-info="http://www.rsyslog.com"] start
Oct 27 04:48:07 ODA02 kernel: [ 0.000000] Initializing cgroup subsys cpuset
Oct 27 04:48:07 ODA02 kernel: [ 0.000000] Initializing cgroup subsys cpu
Oct 27 04:48:07 ODA02 kernel: [ 0.000000] Initializing cgroup subsys cpuacct
Oct 27 04:48:07 ODA02 kernel: [ 0.000000] Linux version 4.1.12-124.20.3.el6uek.x86_64 ([email protected]) (gcc version 4.9.2 20150212 (Red Hat 4.9.2-6.2.0.3) (GCC) ) #2 SMP Thu Oct 11 17:47:32 PDT 2018
Oct 27 04:48:07 ODA02 kernel: [ 0.000000] Command line: ro root=/dev/mapper/VolGroupSys-LogVolRoot rd_NO_LUKS rd_MD_UUID=424664a7:c29524e9:c7e10fcf:d893414e rd_LVM_LV=VolGroupSys/LogVolRoot rd_LVM_LV=VolGroupSys/LogVolSwap SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM pci=noaer [email protected] loglevel=3 panic=60 transparent_hugepage=never biosdevname=1 ipv6.disable=1 intel_idle.max_cstate=1 nofloppy nomce numa=off console=ttyS0,115200n8 console

Solution

Reduce raid check CPU and IO priority

By default raid check is configured with low priority. Setting the priority to idle would ensure to limit the resource used by the check.

Change NICE=low to NICE=idle in /etc/sysconfig/raid-check configuration file.

[[email protected] log]# cat /etc/sysconfig/raid-check
#!/bin/bash
#
# Configuration file for /usr/sbin/raid-check
#
# options:
# ENABLED - must be yes in order for the raid check to proceed
# CHECK - can be either check or repair depending on the type of
# operation the user desires. A check operation will scan
# the drives looking for bad sectors and automatically
# repairing only bad sectors. If it finds good sectors that
# contain bad data (meaning that the data in a sector does
# not agree with what the data from another disk indicates
# the data should be, for example the parity block + the other
# data blocks would cause us to think that this data block
# is incorrect), then it does nothing but increments the
# counter in the file /sys/block/$dev/md/mismatch_count.
# This allows the sysadmin to inspect the data in the sector
# and the data that would be produced by rebuilding the
# sector from redundant information and pick the correct
# data to keep. The repair option does the same thing, but
# when it encounters a mismatch in the data, it automatically
# updates the data to be consistent. However, since we really
# don't know whether it's the parity or the data block that's
# correct (or which data block in the case of raid1), it's
# luck of the draw whether or not the user gets the right
# data instead of the bad data. This option is the default
# option for devices not listed in either CHECK_DEVS or
# REPAIR_DEVS.
# CHECK_DEVS - a space delimited list of devs that the user specifically
# wants to run a check operation on.
# REPAIR_DEVS - a space delimited list of devs that the user
# specifically wants to run a repair on.
# SKIP_DEVS - a space delimited list of devs that should be skipped
# NICE - Change the raid check CPU and IO priority in order to make
# the system more responsive during lengthy checks. Valid
# values are high, normal, low, idle.
# MAXCONCURENT - Limit the number of devices to be checked at a time.
# By default all devices will be checked at the same time.
#
# Note: the raid-check script is run by the /etc/cron.d/raid-check cron job.
# Users may modify the frequency and timing at which raid-check is run by
# editing that cron job and their changes will be preserved across updates
# to the mdadm package.
#
# Note2: you can not use symbolic names for the raid devices, such as you
# /dev/md/root. The names used in this file must match the names seen in
# /proc/mdstat and in /sys/block.
 
ENABLED=yes
CHECK=check
NICE=idle
# To check devs /dev/md0 and /dev/md3, use "md0 md3"
CHECK_DEVS=""
REPAIR_DEVS=""
SKIP_DEVS=""
MAXCONCURRENT=

Change raid-check scheduling

Configure raid-check to be run in low activity period. Avoid running raid check during database backup periods for example.

[[email protected] ~]# cd /etc/cron.d
 
[[email protected] cron.d]# cat raid-check
# Run system wide raid-check once a week on Sunday at 1am by default
0 1 * * Sun root /usr/sbin/raid-check
 
[[email protected] cron.d]# vi raid-check
 
[[email protected] cron.d]# cat raid-check
# Run system wide raid-check once a week on Sunday at 1am by default
0 19 * * Sat root /usr/sbin/raid-check

Conclusion

These configuration changes could be successfully tested on customer environment. No crash/hang was experienced with NICE parameter set to idle.
As per the oracle documentation, the ODA BIOS default configuration could be change to use hardware raid.
ODA – configuring RAID
The question would be if patching an ODA is still possible afterwards. If you would like to changed this configuration I would strongly recommend you to get Oracle support approval.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Marc Wagner
Marc Wagner

Consultant