Introduction

Oracle Database Appliances rely on ASM to manage disk redundancy, and ASM is brilliant. Unlike RAID, it manages redundancy at the block level rather than at the disk level, so there is no need for parity disks. NORMAL redundancy, which is similar to RAID1, needs at least 2 disks, but it works just as well with 3, 4, 5 disks and so on. HIGH redundancy, which does not exist in RAID technology, is basically triple mirroring: each block is written to 3 different disks. It needs at least 3 disks, but you can also use 4, 5, 6 disks and so on. You can add and remove disks online, without any downtime, and adjust the degree of parallelism of the rebalancing operations to either speed them up or lower their CPU usage.

RAW space vs usable space

As there is no RAID controller in your ODA, what you see from the system, and more precisely from the ASM instance, is the RAW space available. For example, on an ODA X8-2M with 4 disks, RAW capacity is 25.6TB. This is the free space you would see on this kind of ODA if no databases were configured on it. This is not a problem as long as you understand that you don’t really have these 25.6TB. There is also a notion of usable space. One might think it is simply the space available once redundancy has been taken into account, but it’s not exactly that. It can actually be quite different depending on your ODA.

Real world example

For my example, I will use an ODA X8-2M with 4 disks running release 19.6. Redundancy has been set to NORMAL, and the DATA/RECO ratio to 90/10. Several databases are running on this ODA. According to the spec sheet, the ODA X8-2M comes with 2x 6.4TB disks as standard, and you can add up to 5 expansions, each expansion being a bundle of 2x 6.4TB disks. RAW capacity therefore starts at 12.8TB and goes up to 76.8TB. As you probably know, a 6.4TB disk does not give you 6.4TB of real usable capacity, so don’t expect to store more than 5.8TB on each disk. But this is not specific to ODA: disk manufacturers have been printing optimistic sizes on their disks for years.
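
The capacity figures above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes the spec-sheet size is expressed in decimal terabytes (10^12 bytes) while the system reports binary units (2^40 bytes), which would explain most of the gap between 6.4TB and roughly 5.8TB. The function names are mine and purely illustrative.

```python
# Assumption: spec-sheet sizes are decimal terabytes (10**12 bytes),
# while the system reports binary units (2**40 bytes).
DISK_TB = 6.4  # per-disk size quoted in the text for the X8-2M


def raw_capacity_tb(nb_disks: int) -> float:
    """RAW capacity in decimal TB for a given number of disks."""
    return round(nb_disks * DISK_TB, 1)


def decimal_tb_to_tib(size_tb: float) -> float:
    """Convert decimal terabytes to binary tebibytes."""
    return size_tb * 10**12 / 2**40


# Base configuration (2 disks), the example ODA (4 disks), fully expanded (12 disks)
for nb_disks in (2, 4, 12):
    print(nb_disks, "disks:", raw_capacity_tb(nb_disks), "TB raw")

print("one disk:", round(decimal_tb_to_tib(DISK_TB), 2), "TiB")
```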

I’m using the V$ASM_DISKGROUP dynamic view from the +ASM1 instance to check available space and free space.

desc v$asm_diskgroup
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 GROUP_NUMBER                                       NUMBER
 NAME                                               VARCHAR2(30)
 SECTOR_SIZE                                        NUMBER
 LOGICAL_SECTOR_SIZE                                NUMBER
 BLOCK_SIZE                                         NUMBER
 ALLOCATION_UNIT_SIZE                               NUMBER
 STATE                                              VARCHAR2(11)
 TYPE                                               VARCHAR2(6)
 TOTAL_MB                                           NUMBER
 FREE_MB                                            NUMBER
 HOT_USED_MB                                        NUMBER
 COLD_USED_MB                                       NUMBER
 REQUIRED_MIRROR_FREE_MB                            NUMBER
 USABLE_FILE_MB                                     NUMBER
 OFFLINE_DISKS                                      NUMBER
 COMPATIBILITY                                      VARCHAR2(60)
 DATABASE_COMPATIBILITY                             VARCHAR2(60)
 VOTING_FILES                                       VARCHAR2(1)
 CON_ID                                             NUMBER

One can guess that real diskgroup free % is normally FREE_MB/TOTAL_MB:

SQL> set lines 200
SQL> select GROUP_NUMBER, NAME, TOTAL_MB, FREE_MB, USABLE_FILE_MB, TYPE from v$asm_diskgroup;

GROUP_NUMBER NAME                             TOTAL_MB    FREE_MB USABLE_FILE_MB TYPE
------------ ------------------------------ ---------- ---------- -------------- ------
           1 DATA                             21977088    9851876        2178802 NORMAL
           2 RECO                              2441216    1421004         405350 NORMAL


Select round(9851876/21977088*100,1) "% Free"  from dual;

    % Free
----------
      44.8


Free space is more than 44% on my ODA. Not bad.

And when I use USABLE_FILE_MB to get another metric for the same thing:

SQL> set lines 200
SQL> select GROUP_NUMBER, NAME, TOTAL_MB, FREE_MB, USABLE_FILE_MB, TYPE from v$asm_diskgroup;

GROUP_NUMBER NAME                             TOTAL_MB    FREE_MB USABLE_FILE_MB TYPE
------------ ------------------------------ ---------- ---------- -------------- ------
           1 DATA                             21977088    9851876        2178802 NORMAL
           2 RECO                              2441216    1421004         405350 NORMAL


Select round(2178802/21977088*100,1) "% Free"  from dual;

    % Free
----------
       9.9

This is bad. According to this metric, I have less than 10% free in that diskgroup. I’m getting anxious… I thought I was fine but I’m now critical?
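
To see both metrics side by side for DATA and RECO, here is a small sketch that redoes the two percentages from the figures returned by V$ASM_DISKGROUP above. The dictionary layout and the `pct` helper are only illustrative, they are not part of any Oracle tool.

```python
# Values copied from the V$ASM_DISKGROUP output above (4-disk ODA).
diskgroups = {
    "DATA": {"total_mb": 21977088, "free_mb": 9851876, "usable_file_mb": 2178802},
    "RECO": {"total_mb": 2441216, "free_mb": 1421004, "usable_file_mb": 405350},
}


def pct(part_mb: int, total_mb: int) -> float:
    """Share of total_mb represented by part_mb, as a percentage."""
    return round(part_mb / total_mb * 100, 1)


for name, dg in diskgroups.items():
    free_pct = pct(dg["free_mb"], dg["total_mb"])
    usable_pct = pct(dg["usable_file_mb"], dg["total_mb"])
    print(f"{name}: {free_pct}% by FREE_MB, {usable_pct}% by USABLE_FILE_MB")
```

The same gap shows up on RECO, so this is not specific to the DATA diskgroup.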

What is USABLE_FILE_MB really?

When you look into the documentation, it’s quite clear:

  • USABLE_FILE_MB takes the diskgroup redundancy into account. Of the 9’851’876 MB free, only half, 4’925’938 MB, can actually hold data in that diskgroup, because with NORMAL redundancy each block exists on 2 different disks. This matches what has been said before
  • USABLE_FILE_MB also assumes that one disk can be lost while redundancy remains guaranteed. On this ODA with 4 disks, ¼ of the total disk space therefore shouldn’t be considered as available, unlike on a RAID system (where the loss of one disk is not visible to the system). For a total of 21’977’088 MB, only 16’482’816 MB should be considered as usable for DATA
  • Finally, USABLE_FILE_MB combines these 2 facts. For NORMAL redundancy, the formula is USABLE_FILE_MB = (FREE_MB - TOTAL_MB/nb_disks) / 2 = (9’851’876 MB - 5’494’272 MB) / 2 = 2’178’802 MB
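
The three points above can be replayed step by step. This is a minimal sketch of the NORMAL redundancy formula exactly as written in the last bullet; the function name `usable_file_mb` is my own and it covers NORMAL redundancy only.

```python
def usable_file_mb(free_mb: int, total_mb: int, nb_disks: int) -> float:
    """NORMAL redundancy formula from the text:
    keep one disk's worth of space free, then halve what is left."""
    reserved_mb = total_mb / nb_disks
    return (free_mb - reserved_mb) / 2


# DATA diskgroup on the 4-disk ODA
total_mb, free_mb, nb_disks = 21977088, 9851876, 4

# First bullet: only half of the free space can hold data
print(free_mb / 2)                                    # 4925938.0
# Second bullet: total space minus one disk's worth
print(total_mb - total_mb / nb_disks)                 # 16482816.0
# Third bullet: both facts combined
print(usable_file_mb(free_mb, total_mb, nb_disks))    # 2178802.0
```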

Let’s take another example to be sure. This time it’s an ODA X8-2M with 6 disks in NORMAL redundancy. Let’s do the math:

SQL> set lines 200
SQL> select GROUP_NUMBER, NAME, TOTAL_MB, FREE_MB, USABLE_FILE_MB, TYPE from v$asm_diskgroup;

GROUP_NUMBER NAME                             TOTAL_MB    FREE_MB USABLE_FILE_MB TYPE
------------ ------------------------------ ---------- ---------- -------------- ------
           1 DATA                             32965632   25028756        9767242 NORMAL
           2 RECO                              3661824    2549852         969774 NORMAL

select round((FREE_MB - TOTAL_MB/6)/2) "DATA_USABLE_FILE_MB" from v$asm_diskgroup where name='DATA';

DATA_USABLE_FILE_MB
-------------------
            9767242

The formula is correct.
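
To go one step further, the same check can be run against all four rows shown in this post, RECO included. This sketch simply compares the computed value with the USABLE_FILE_MB reported by the view; the row layout and the helper name are illustrative.

```python
# Rows copied from the two V$ASM_DISKGROUP outputs shown above.
# Columns: nb_disks, diskgroup, TOTAL_MB, FREE_MB, reported USABLE_FILE_MB
ROWS = [
    (4, "DATA", 21977088, 9851876, 2178802),
    (4, "RECO", 2441216, 1421004, 405350),
    (6, "DATA", 32965632, 25028756, 9767242),
    (6, "RECO", 3661824, 2549852, 969774),
]


def formula_usable_mb(nb_disks: int, total_mb: int, free_mb: int) -> float:
    """NORMAL redundancy formula from the text."""
    return (free_mb - total_mb / nb_disks) / 2


for nb_disks, name, total_mb, free_mb, reported in ROWS:
    computed = formula_usable_mb(nb_disks, total_mb, free_mb)
    status = "match" if computed == reported else "MISMATCH"
    print(f"{nb_disks} disks {name}: computed {computed:.0f}, reported {reported} ({status})")
```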

Should I use USABLE_FILE_MB for monitoring?

That’s a good question. Monitoring USABLE_FILE_MB means considering the worst case, while monitoring FREE_MB/TOTAL_MB means considering the best case. Using FREE_MB seems to be the recommended approach, but with lower thresholds than for a normal filesystem: WARNING should be triggered when 65/70% of the space is used, and CRITICAL when 80/85% is used. There are 2 reasons for this: the volume fills up 2 times faster than it would appear to behind a RAID system (3 times faster with HIGH redundancy), and when your disks are nearly full, the only way to extend the volume is to buy new disks from Oracle (if you have not already reached the limit).
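
As an illustration, such a check could look like the sketch below. It is not taken from any monitoring product: the function names are hypothetical, and I arbitrarily picked the lower end of each range (65% and 80% used) as thresholds, so adjust them to your own policy.

```python
# Assumed thresholds: lower end of the 65/70% and 80/85% ranges from the text.
WARNING_PCT = 65.0
CRITICAL_PCT = 80.0


def used_pct(total_mb: int, free_mb: int) -> float:
    """Used space as a percentage of TOTAL_MB, based on FREE_MB."""
    return (total_mb - free_mb) / total_mb * 100


def alert_level(total_mb: int, free_mb: int) -> str:
    """Map the used percentage to OK / WARNING / CRITICAL."""
    used = used_pct(total_mb, free_mb)
    if used >= CRITICAL_PCT:
        return "CRITICAL"
    if used >= WARNING_PCT:
        return "WARNING"
    return "OK"


# DATA diskgroup on the 4-disk ODA: about 55% used
print(round(used_pct(21977088, 9851876), 1), alert_level(21977088, 9851876))
```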

Remember that the only real resilience guarantee for an ODA is not having enough free space in the diskgroups to lose one disk, but having a functional Data Guard configuration. That’s why I never configure HIGH redundancy on an ODA: it’s a waste of disk space and it does not give me much higher failure tolerance (I still have “only” 2 power supplies and 2 network interfaces).

To make it crystal clear, let’s compare again to a RAID system. Imagine you have a RAID1 system with 4x 6TB disks. These 4 disks have a RAW capacity of 24TB, but only 12TB are usable. If you lose one disk, 12TB are still usable, but you’ve lost redundancy for half of the data. With ASM in NORMAL redundancy, you see a total of 24TB, but only 12TB is really available for your databases. And if you look at USABLE_FILE_MB, you will find that only 9TB is usable: (24TB - 6TB) / 2. Staying within that limit means your redundancy can survive a disk crash. The RAID system is simply not able to do that.

Furthermore, you could do the same with RAID1, but you would need 5 disks instead of 4, the fifth one being a spare disk used to rebuild redundancy if one of the four disks fails.
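
The comparison can be put into numbers with a short sketch. It assumes simple mirrored pairs for RAID1 and, for ASM, the NORMAL redundancy formula discussed earlier applied to an empty diskgroup. Both function names are illustrative.

```python
def raid1_usable_tb(nb_disks: int, disk_tb: float, spares: int = 0) -> float:
    """Usable capacity of mirrored pairs; spare disks hold no data."""
    return (nb_disks - spares) * disk_tb / 2


def asm_normal_usable_tb(nb_disks: int, disk_tb: float) -> float:
    """Empty NORMAL redundancy diskgroup: keep one disk's worth of
    space free, then halve what is left."""
    raw_tb = nb_disks * disk_tb
    return (raw_tb - disk_tb) / 2


print(raid1_usable_tb(4, 6))            # 12.0, no room to rebuild after a failure
print(asm_normal_usable_tb(4, 6))       # 9.0, redundancy can be restored on 3 disks
print(raid1_usable_tb(5, 6, spares=1))  # 12.0, but with a fifth disk as spare
```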

Can I still use storage when USABLE_FILE_MB is 0?

Yes, you can. But you have to know that if you lose a disk, redundancy cannot be guaranteed anymore, just as on a RAID system. You can even see negative values in USABLE_FILE_MB.
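
Negative values follow directly from the NORMAL redundancy formula: as soon as FREE_MB drops below one disk's worth of space, the result goes below zero. In the sketch below, only the first FREE_MB value comes from my ODA; the two others are hypothetical, chosen to show the zero and negative cases.

```python
def usable_file_mb(free_mb: int, total_mb: int, nb_disks: int) -> float:
    """NORMAL redundancy formula from the text."""
    return (free_mb - total_mb / nb_disks) / 2


total_mb, nb_disks = 21977088, 4  # DATA diskgroup on the 4-disk ODA

# 9851876 is the real FREE_MB; 5494272 and 4000000 are hypothetical values.
for free_mb in (9851876, 5494272, 4000000):
    print(free_mb, usable_file_mb(free_mb, total_mb, nb_disks))
```

With 4’000’000 MB free, the diskgroup can still accept new files, but there is no longer enough room to rebuild redundancy after the loss of a disk.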

And what about the number of disks?

For sure, the more disks you have, the less space you will “lose” from the USABLE_FILE_MB point of view. An ODA with 3 or 4 disks in NORMAL redundancy is definitely not very comfortable, but starting from 6 disks, USABLE_FILE_MB becomes much more comfortable.
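
To give an idea of the effect, here is a sketch that applies the NORMAL redundancy formula to an empty diskgroup for several disk counts. It assumes identical disks and the formula discussed earlier; the helper name is illustrative.

```python
def usable_fraction(nb_disks: int) -> float:
    """Share of RAW space that counts as usable on an empty NORMAL
    redundancy diskgroup: (1 - 1/nb_disks) / 2."""
    return (1 - 1 / nb_disks) / 2


# Disk counts from 4 up to a fully expanded X8-2M (12 disks)
for nb_disks in (4, 6, 8, 10, 12):
    print(f"{nb_disks} disks: {usable_fraction(nb_disks):.1%} of RAW space")
```

The share gets closer to 50% as disks are added, which is the best you can get with 2 copies of each block.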

On a 2-disk ODA with NORMAL redundancy, there is no way to keep redundancy after losing a disk. That’s quite obvious. X8-2S and X8-2M with the base disk configuration are not that nice for this reason.

The number of disks is not only a matter of how much storage you need, it also brings an increased level of security for your databases. The more disks you have, the more disk failures you can survive while keeping redundancy (provided the disks do not fail simultaneously, of course).

Conclusion

ODA storage is commonly misunderstood because it does not use classic RAID. ASM is very powerful and more secure than a RAID system. Don’t hesitate to order more disks than needed for your ODAs. Yes, it’s expensive, but it is a good investment for the next 5 years. And it’s usually cheaper to order additional disks with the ODA than to order them later.


Jérôme Dubar

Consultant