Infrastructure at your Service

Franck Pachot

How to measure Oracle index fragmentation

At Oracle Open World 2014, or rather the Oaktable World, Chris Antognini has presented ‘Indexes: Structure, Splits and Free Space Management Internals’. It’s not something new, but it’s still something that is not always well understood: how index space is managed, block splits, fragmentation, coalesce and rebuilds. Kyle Hailey has made a video of it available here.
For me, it is the occasion to share the script I use to see if an index is fragmented or not.

First, forget about those ‘analyze index validate structure’ which locks the table, and the DEL_LEAF_ROWS that counts only the deletion flags that are transient. The problem is not the amount of free space. The problem is where is that free space. Because if you will insert again in the same range of values, then that space will be reused. Wasted space occurs only when lot of rows were deleted in a range where you will not insert again. For exemple, when you purge old ORDERS, then the index on the ORDER_DATE – or on the ORDER_ID coming from a sequence – will be affected. Note that the problem occurs only for sparse purges because full blocks are reclaimed when needed and can get rows from an other range of value.

I have a script that shows the number of rows per block, as well as used and free space per block, and aggregates that by range of values.

First, let’s create a table with a date and an index on it:
drop table DEMOTABLE;
create table DEMOTABLE as select sysdate-900+rownum/1000 order_date,decode(mod(rownum,100),0,'N','Y') delivered , dbms_random.string('U',16) cust_id from (select * from dual connect by level <= 1e4 );
create index DEMOINDEX on DEMOTABLE(ORDER_DATE) pctfree 90;

My script shows 10 buckets with begin and end value and for each of them the averge number of rows per block and the free space:
SQL> @index_fragmentation
 
ORDER_DAT -> ORDER_DAT rows/block bytes/block %free space blocks free
--------- -- --------- ---------- ----------- ----------- ---------- -----
24-APR-12 -> 02-AUG-12 377 7163 11 266
03-AUG-12 -> 11-NOV-12 377 7163 11 266
11-NOV-12 -> 19-FEB-13 377 7163 11 266
19-FEB-13 -> 30-MAY-13 377 7163 11 265
30-MAY-13 -> 07-SEP-13 377 7163 11 265
07-SEP-13 -> 16-DEC-13 377 7163 11 265
16-DEC-13 -> 26-MAR-14 377 7163 11 265
26-MAR-14 -> 03-JUL-14 377 7163 11 265
04-JUL-14 -> 11-OCT-14 377 7163 11 265
12-OCT-14 -> 19-JAN-15 376 7150 11 265

Note that the script reads all the table (it can do a sample but here it is 100%). Not exactly the table but only the index. It counts the index leaf blocks with the undocumented function sys_op_lbid() which is used by oracle to estimate the clustering factor.

So here I have no fragmentation. All blocks have about 377 rows and no free space. This is because I inserted them in increasing order and the so colled ’90-10′ block split occured.

Let’s see what I get if I delete most of the rows before the 01-JAN-2014:
SQL> delete from DEMOTABLE where order_dateSQL> @index_fragmentation
 
ORDER_DAT -> ORDER_DAT rows/block bytes/block %free space blocks free
--------- -- --------- ---------- ----------- ----------- ---------- -----
25-APR-12 -> 02-AUG-12 4 72 99 266 oooo
03-AUG-12 -> 11-NOV-12 4 72 99 266 oooo
11-NOV-12 -> 19-FEB-13 4 72 99 266 oooo
19-FEB-13 -> 30-MAY-13 4 72 99 265 oooo
30-MAY-13 -> 07-SEP-13 4 72 99 265 oooo
07-SEP-13 -> 16-DEC-13 4 72 99 265 oooo
16-DEC-13 -> 26-MAR-14 4 72 99 265 oooo
26-MAR-14 -> 03-JUL-14 4 72 99 265 oooo
04-JUL-14 -> 11-OCT-14 46 870 89 265 oooo
12-OCT-14 -> 19-JAN-15 376 7150 11 265

I have the same buckets, and same number of blocks. But blocks which are in the range below 01-JAN-2014 have only 4 rows and a lot of free space. This is exactly what I want to detect: I can check if that free space will be reused.

Here I know I will not enter any orders with a date in the past, so those blocks will never have an insert into them. I can reclaim that free space with a COALESCE:
SQL> alter index DEMOINDEX coalesce;
Index altered.
SQL> @index_fragmentation
ORDER_DAT to ORDER_DAT rows/block bytes/block %free space blocks free
--------- -- --------- ---------- ----------- ----------- ---------- -----
25-APR-12 -> 03-OCT-14 358 6809 15 32
03-OCT-14 -> 15-OCT-14 377 7163 11 32
15-OCT-14 -> 27-OCT-14 377 7163 11 32
27-OCT-14 -> 08-NOV-14 377 7163 11 32
08-NOV-14 -> 20-NOV-14 377 7163 11 32
20-NOV-14 -> 02-DEC-14 377 7163 11 32
02-DEC-14 -> 14-DEC-14 377 7163 11 32
14-DEC-14 -> 26-DEC-14 377 7163 11 32
27-DEC-14 -> 07-JAN-15 377 7163 11 32
08-JAN-15 -> 19-JAN-15 371 7056 12 32

I still have 10 buckets because this is defined in my script, but each bucket noew has less rows. I’ve defragmented the blocks and reclaimed the free blocks.

Time to share the script now. Here it is: index_fragmentation.zip

The script is quite ugly. It’s SQL generated by PL/SQL. It’s generated because it selects the index columns. And as I don’t want to have it too large it is not indented nor commented. However, if you run it with set servertoutput on you will see the generated query.

How to use it? Just change the owner, table_name, and index name. It reads the whole index so if you have a very large index you may want to change the sample size.

 

7 Comments

  • Hi Franck, this is a great blog I read and it really helps. Although it’s very time consuming for large database (billion of rows), but the result set is excellent. I manage away to capture the dynamic sql into a table so I can use it to globally to rebuild index.

    I have a few questions for you
    1. To reduce the time, what will be the ideal value for the bucket and sample? If I reduce the bucket to 5 and sample to 50, will it give me the same accuracy as 10 and 100?
    2. To determine if index needs to be rebuilt, simply look at the “free space” if it has at least “o” it’s a candidate to rebuild, correct?
    3. I rebuild the index either coalesce, rebuild or rebuild online then run the script again, it still shows the same result as of I need to rebuild that index again. Not sure why or does it suppose to work that way?
    4. How do I determine a partition index?

    Thanks Franck.

    Eddy Soeparmin

     
  • Hi Eddy,
    1. sample is sample of leaf blocks read, buckets is for number of lines displayed. On big index, start with sample 1, then sample 10, see if it is very different. You can even change the generated query to focus on a specific range of values.
    2. No, it’s not because there is free space that you have to rebuild. The range of value is displayed so that you know if that space will be reused or not. the whole idea is to understand why you have lot of free space in some range of value.
    3. when you rebuild, you can expect to have PCTFREE (default 10%) free space. After some DML it will tend to 75% (average between full blocks and 50-50 split)
    Regards,
    Franck.

     
  • Hi Franck,

    I guess I still don’t understand how to read the range of value or how to determine if the index needs to be rebuilt. The generated script shows the field name(s) from a given index file, rows/block, bytes/block, %free space, blocks, and free. If this scripts is run as an Oracle job, how do I determine which index needs to be rebuilt and which one to skip? Do I need to join with other table such dba_indexes, dba_segments, dba_tables??

    Please advised since i’m still new in this area. Thanks Franck.

    Eddy Soeparmin

     
  • Hi Eddy,
    Unfortunately there is nothing that can tell you if you have to rebuild or not. The script gives you the free space but in order to decide if it’s good or not to have that free space, you need to know your application. It depends on the queries that are done (do you read lot of leaf blocks) and it depends on the way data is inserted and updated. You need free space for new entries, or the block will be split and then you will have two half empty blocks.
    This is the reason why I show the range of values. It helps to match unexpected free space with application behavior, purge operations, etc.
    Regards,
    Franck

     
  • Nonra says:

    I executed script, but on a index with more than 30GB I got this error:

    ERROR at line 1:
    ORA-06502: PL/SQL: numeric or value error: character string buffer too small
    ORA-06512: at line 20
    ORA-06512: at line 48

    I tried with change the sample size but doesn’t works.

    Can you help me?

     
  • […] vs. SQL Server Index performance, I ran across this great script to check for fragmentation from Franck Pachot.  You’ll need to simply update the script to declare the table and index name or simply edit […]

     

Leave a Reply


+ 3 = ten

Franck Pachot
Franck Pachot

Technology Leader