Infrastructure at your Service

Franck Pachot

Index on trunc(date) – do you still need old index?

Sometimes we have to index on ( trunc(date) ) because a SQL statement uses predicate on it instead of giving a range from midnight to midnight. When you do that you probably keep the index on the column. That’s two indexes to maintain for DML. Do we need it?

I’ll show a feature that appeared in 11.2.0.2 (info from oracle-l) so let’s set the optimizer to behave as before that feature.

SQL> alter session set optimizer_features_enable='11.2.0.1';
Session altered.

I create the following table with a date column;

SQL> create table DEMO as select prod_id,prod_name,prod_eff_from +rownum/0.3 prod_date from sh.products,(select * from dual connect by 1000˂=level);
Table created.

and I have an index on the date column:

SQL> create index PROD_NAME on DEMO(prod_name);
Index created.

SQL> create index PROD_DATE on DEMO(prod_date);
Index created.

The index on the name is for another blog post…

 trunc(date)=…

It’s quite common to encounter a query that TRUNC the column in order to search for a date – whatever the time component is. We all know that it is better to use a BETWEEN because applying a function an indexed column prevents the index access:

SQL> set autotrace trace explain
SQL> select * from DEMO where trunc(prod_date)=date'2015-01-01';

Execution Plan
----------------------------------------------------------
Plan hash value: 4000794843

---------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)|
---------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    49 |     2   (0)|
|*  1 |  TABLE ACCESS FULL| DEMO |     1 |    49 |     2   (0)|
---------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TRUNC(INTERNAL_FUNCTION("PROD_DATE"))=TO_DATE('
              2015-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

 

But sometimes we can’t change the query and just have to find a workaround. And function based indexes are a gread help for that:

SQL> create index PROD_DATE_TRUNC on DEMO( trunc(prod_date) );
Index created.

and that index can be used for the query above:

SQL> select * from DEMO where trunc(prod_date)=date'2015-01-01';

Execution Plan
----------------------------------------------------------
Plan hash value: 1760965557

------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |     1 |    58 |     1   (0)|
|   1 |  TABLE ACCESS BY INDEX ROWID| DEMO            |     1 |    58 |     1   (0)|
|*  2 |   INDEX RANGE SCAN          | PROD_DATE_TRUNC |     1 |       |     1   (0)|
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(TRUNC(INTERNAL_FUNCTION("PROD_DATE"))=TO_DATE(' 2015-01-01 00:00:00',
              'syyyy-mm-dd hh24:mi:ss'))

The TRUNC(INTERNAL_FUNCTION(“PROD_DATE”)) is still there but now it’s an access predicate instead of a filter predicate. Our new index has been used.

 date > … and date < …

Ok. But now i’ve two indexes instead of one. It’s an overhead when inserting, deleting, and updating that date column. If I’m sure that we query only with the trunc function I can drop it.

SQL> drop index PROD_DATE;
Index dropped.

But what happens if a query was well written, using a range instead of trunc:

SQL> select * from DEMO where prod_date >= date'2015-01-01' and prod_date ˂ date'2015-01-02';

Execution Plan
----------------------------------------------------------
Plan hash value: 4000794843

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    49 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| DEMO |     1 |    49 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROD_DATE">=TO_DATE(' 2015-01-01 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "PROD_DATE"˂TO_DATE(' 2015-01-02 00:00:00',
              'syyyy-mm-dd hh24:mi:ss'))

A full table scan. Does that mean that I have to maintain two indexes? That was in 11.2.0.2 but let’s see the behaviour after the next patchset:

SQL> alter session set optimizer_features_enable='11.2.0.2';
Session altered.

SQL> select * from DEMO where prod_date >= date'2015-01-01' and prod_date ˂ date'2015-01-02';

Execution Plan
----------------------------------------------------------
Plan hash value: 1760965557

-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |     1 |    58 |     1   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| DEMO            |     1 |    58 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | PROD_DATE_TRUNC |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROD_DATE">=TO_DATE(' 2015-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND "PROD_DATE"˂TO_DATE(' 2015-01-02 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   2 - access(TRUNC(INTERNAL_FUNCTION("PROD_DATE"))>=TRUNC(TO_DATE(' 2015-01-01
              00:00:00', 'syyyy-mm-dd hh24:mi:ss')) AND TRUNC(INTERNAL_FUNCTION("PROD_DATE"))˂=TRUNC(
              TO_DATE(' 2015-01-02 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))

Since 11.2.0.2 we don’t need to keep the old index. The one with the trunc() can be used.

use the time part

However, something is missing now. If we want to select or filter the full date with time, we have to go to the table because the time part is not in our new index:

SQL> select prod_date from DEMO where prod_date >= date'2015-01-01' and prod_date ˂ date'2015-01-02';

Execution Plan
----------------------------------------------------------
Plan hash value: 1760965557

-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |     1 |    18 |     1   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| DEMO            |     1 |    18 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | PROD_DATE_TRUNC |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROD_DATE">=TO_DATE(' 2015-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND "PROD_DATE"˂TO_DATE(' 2015-01-02 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   2 - access(TRUNC(INTERNAL_FUNCTION("PROD_DATE"))>=TRUNC(TO_DATE(' 2015-01-01
              00:00:00', 'syyyy-mm-dd hh24:mi:ss')) AND TRUNC(INTERNAL_FUNCTION("PROD_DATE"))˂=TRUNC(
              TO_DATE(' 2015-01-02 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))

That TABLE ACCESS BY INDEX ROWID is usually what is expensive in an index access. In that case, do we need to keep the old index?

No, a better solution is to add the date – without a trunc – in our new index:

SQL> drop index PROD_DATE_TRUNC;
Index dropped.

SQL> create index PROD_DATE_TRUNC on DEMO( trunc(prod_date) , prod_date );
Index created.

and now we don’t need to go to the table:

SQL> select prod_date from DEMO where prod_date >= date'2015-01-01' and prod_date ˂ date'2015-01-02';

Execution Plan
----------------------------------------------------------
Plan hash value: 547246927

------------------------------------------------------------------------------------
| Id  | Operation        | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                 |     1 |    18 |     1   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| PROD_DATE_TRUNC |     1 |    18 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access(TRUNC(INTERNAL_FUNCTION("PROD_DATE"))>=TRUNC(TO_DATE('
              2015-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) AND "PROD_DATE">=TO_DATE('
              2015-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              TRUNC(INTERNAL_FUNCTION("PROD_DATE"))˂=TRUNC(TO_DATE(' 2015-01-02
              00:00:00', 'syyyy-mm-dd hh24:mi:ss')) AND "PROD_DATE"˂TO_DATE(' 2015-01-02
              00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
       filter("PROD_DATE">=TO_DATE(' 2015-01-01 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "PROD_DATE"˂TO_DATE(' 2015-01-02 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

And there is one index only to maintain.

Conclusion

From 11.2.0.2 an index on trunc(date) is sufficient for access through predicates on the date without time part – even if we don’t use the trunc() in the predicate. If we need to get the time part without having the overhead of reading the table, then we can add the column without function in the function based index. No need to maintain bot indexes.

 

6 Comments

  • Rainer Stenzel says:

    Hello Frank,
    thank you for this revelation and publication. Do you already know of broader appliance of this substitution/unification technique e.g. UPPER or DECODE ?
    Best regards

     
  • Mohamed Houri says:

    Hi Frank

    Yes that’s an interesting point you have pointed out. I did the same experiment using an index on a virtual column and a query predicate on the column that serves for the virtual column definition
    https://hourim.wordpress.com/2013/10/25/index-on-a-virtual-column-would-it-help-others/
    There is a special extension for the TRUNC function that seems not have been extended to other SQL functions.

     
  • Hi,
    Thanks for the comments.
    As Mohamed shows in his blog post, the TRUNC optimization is also available for numbers.
    There is the same kind of optimization for SUBSTR and I’ll post about it soon. That was planned – reason why I’ve an index on PROD_NAME. Thanks for the reminder ;)
    Regards,
    Franck

     
  • Hi Franck – great post. I was super excited about this feature, but then discovered the Optimizer makes something strange with the costing if the index on trunc(date) is a bitmap index. In my test case the cost was was 100 times higher than a full table scan – so Oracle did not use the bitmap index.

    With identical data – a column with the content of trunc(date) – Oracle uses the bitmap index, although the cost is still significantly higher (10517 ) than a b-tree index (750).

    Test case if you can’t reproduce on your data set:

    create table ckk nologging as
    select rownum id, mod(rownum,5) mod5_id, mod(rownum,5000) mod5000_id, sysdate dt_fixed, sysdate - rownum/24/60 dt_dec, sysdate + rownum/24/60 dt_pos, sysdate + ora_hash(rownum,65,535)/24 dt_rand, sysdate+mod(rownum,10) dt_mod10, rpad('x',3500,'x') filler, trunc(sysdate + rownum/24/60) trunc_dt_pos
    from (select rownum r from dual connect by level <= 10000) r1, (select rownum r from dual connect by level <= 1000)
    ;

    I create multiple indexes for testing – set them invisible.

    create /*+ parallel(24) */ index ckk2$dt_pos_nc2 on ckk2(trunc(dt_pos)) invisible;--26000
    create /*+ parallel(24) */ bitmap index ckk2$dt_pos_B2 on ckk2(trunc(dt_pos)) ;-- 2300
     
    create /*+ parallel(24) */ index ckk2$dt_pos_nc on ckk2(trunc_dt_pos) invisible;--26000
    create /*+ parallel(24) */ bitmap index ckk2$dt_pos_B on ckk2(trunc_dt_pos) ;-- 2300

    Then following query goes Full Table Scan – unless you hint it:

    select /*+INDEX_COMBINE(T) */ sum(mod5_id) from ckk2 t where dt_pos = to_date('2035-02-13 15:33:12','yyyy-mm-dd hh24:mi:ss')

     
  • Jamel says:

    Hello,

    thsnk very much for your post.

    I gave advice to customer in same way.
    Index is used for short list (less than 10 days).
    But for large result (more than 10 days), index is not used at all !

    Kind regards.

     

Leave a Reply


five × = 5

Franck Pachot
Franck Pachot

Technology Leader