By Franck Pachot

The 12c adaptive plan prepares two join methods (Hash Join and Nested Loops), activates the one that has the better cost for the estimated cardinality, and computes the point of inflection: the cardinality at which the best cost switches to the other join method. At execution time, rows are buffered by a STATISTICS COLLECTOR operation in order to see whether the point of inflection is reached. If it isn't, the plan continues as planned. If it is, the alternative join method is activated. But buffering has a limit…
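That behaviour can be sketched as a small simulation (a conceptual model only, not Oracle internals; the function and the buffer-limit parameter are illustrative):

```python
def adaptive_join(rows, inflection_point, buffer_limit):
    """Pick the join method the way an adaptive plan would: buffer rows
    until the point of inflection is reached (switch to hash join) or
    the buffer limit stops the counting (keep the initial plan)."""
    buffered = 0
    for _ in rows:
        if buffered >= buffer_limit:
            return "NESTED LOOPS"   # buffering stopped before a decision
        buffered += 1
        if buffered >= inflection_point:
            return "HASH JOIN"      # point of inflection reached: switch
    return "NESTED LOOPS"           # few rows: the initial plan was right

adaptive_join(range(200000), 814, 10**6)    # -> 'HASH JOIN'
adaptive_join(range(200000), 2000, 1041)    # -> 'NESTED LOOPS' (limit hit first)
```

The second call shows the case this post is about: when the buffer limit is smaller than the point of inflection, the switch can never happen.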

Let’s try to find this limit empirically.

I create a table with enough rows:


SQL> create table demo1 (n constraint demo1pk primary key,x1) as select 0 , cast('x' as varchar2(4000)) from dual;
Table created.
SQL> insert --+ append
    into demo1 select 1e7+rownum ,'x' from xmltable('1 to 200000');
200000 rows created.

and a second table to join:


SQL> create table demo2 (n constraint demo2pk primary key,x2) as select 0 , 'x' from dual;
Table created.

I filled the DEMO1 table in two steps. First, a CTAS with one row, so that the statistics (from online statistics gathering) favor nested loops. Then I inserted a lot of rows, because I want to fill the adaptive plan buffer. DEMO2 is a small table, but I want the full table scan on it to be a bit more expensive, or the hash join would always be chosen. I do that by faking the number of blocks:


SQL> exec dbms_stats.set_table_stats(user,'DEMO2',numblks=>3000,no_invalidate=>false);
PL/SQL procedure successfully completed.

If I check the execution plan, I see that NESTED LOOPS is chosen because the estimated number of rows is small (artificially set to 1 row):


SQL> explain plan for
  2  select max(x1),max(x2) from demo1 join demo2 using(n);
 
Explained.
 
SQL> select * from table(dbms_xplan.display(format=>'+adaptive'));
 
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 604214593
 
--------------------------------------------------------------------------------------------
|   Id  | Operation                      | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT               |         |     1 |     8 |     4   (0)| 00:00:01 |
|     1 |  SORT AGGREGATE                |         |     1 |     8 |            |          |
|- *  2 |   HASH JOIN                    |         |     1 |     8 |     4   (0)| 00:00:01 |
|     3 |    NESTED LOOPS                |         |     1 |     8 |     4   (0)| 00:00:01 |
|     4 |     NESTED LOOPS               |         |     1 |     8 |     4   (0)| 00:00:01 |
|-    5 |      STATISTICS COLLECTOR      |         |       |       |            |          |
|     6 |       TABLE ACCESS FULL        | DEMO1   |     1 |     4 |     3   (0)| 00:00:01 |
|  *  7 |      INDEX UNIQUE SCAN         | DEMO2PK |     1 |       |     0   (0)| 00:00:01 |
|     8 |     TABLE ACCESS BY INDEX ROWID| DEMO2   |     1 |     4 |     1   (0)| 00:00:01 |
|-    9 |    TABLE ACCESS FULL           | DEMO2   |     1 |     4 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("DEMO1"."N"="DEMO2"."N")
   7 - access("DEMO1"."N"="DEMO2"."N")
 
Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)
 

But the plan is adaptive and can switch to HASH JOIN if more rows than expected are encountered by the STATISTICS COLLECTOR.

I run it and gather runtime statistics:


SQL> alter session set statistics_level=all;
Session altered.
SQL> select max(x1),max(x2) from demo1 join demo2 using(n);
MAX(X1)
----------------------------------------------------------------------------------------------------------------------------------------------------------------
M
-
x
x

And here is the adaptive plan: Hash Join is activated because we actually have a lot of rows (200000):


SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
 
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  d2y436sr1cx3r, child number 0
-------------------------------------
select max(x1),max(x2) from demo1 join demo2 using(n)
 
Plan hash value: 740165205
 
----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem | Used-Tmp|
----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |      1 |        |      1 |00:00:02.04 |     372 |    423 |    483 |       |       |          |         |
|   1 |  SORT AGGREGATE     |       |      1 |      1 |      1 |00:00:02.04 |     372 |    423 |    483 |       |       |          |         |
|*  2 |   HASH JOIN         |       |      1 |      1 |      1 |00:00:02.04 |     372 |    423 |    483 |    11M|  4521K| 1262K (1)|    4096 |
|   3 |    TABLE ACCESS FULL| DEMO1 |      1 |      1 |    200K|00:00:00.20 |     369 |    360 |      0 |       |       |          |         |
|   4 |    TABLE ACCESS FULL| DEMO2 |      1 |      1 |      1 |00:00:00.01 |       3 |      0 |      0 |       |       |          |         |
----------------------------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("DEMO1"."N"="DEMO2"."N")
   7 - access("DEMO1"."N"="DEMO2"."N")
 
Note
-----
   - this is an adaptive plan

The point of inflection is 814:


SQL> column tracefile new_value tracefile
SQL> alter session set tracefile_identifier='cbo_trace';
Session altered.
 
SQL> select tracefile from v$process where addr=(select paddr from v$session where sid=sys_context('userenv','sid'));
 
TRACEFILE
----------------------------------------------------------------------------------------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/cdb/CDB/trace/CDB_ora_3979_cbo_trace.trc
 
SQL> host > &tracefile.
 
SQL> exec dbms_sqldiag.dump_trace(p_sql_id=>'d2y436sr1cx3r',p_child_number=>0,p_component=>'Compiler',p_file_id=>'');
PL/SQL procedure successfully completed.
 
SQL> host grep -E "^DP" &tracefile. | tail
DP - distinct placement
DP: Found point of inflection for NLJ vs. HJ: card = 814.00

Filling the buffer

So here, 814 rows were buffered and the plan switched to HASH JOIN. I want to know how many rows can be buffered, so I want to increase the point of inflection. That's easy: if the cost of the DEMO2 full table scan is higher, then the NESTED LOOPS join remains cheaper than the HASH JOIN even with more rows. Let's fake the DEMO2 statistics to show a larger table:


SQL> exec dbms_stats.set_table_stats(user,'DEMO2',numblks=>4000,no_invalidate=>false);
PL/SQL procedure successfully completed.

And let’s run that again:


SQL> select max(x1),max(x2) from demo1 join demo2 using(n);
 
MAX(X1)
----------------------------------------------------------------------------------------------------------------------------------------------------------------
M
-
x
x
 
SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
 
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  d2y436sr1cx3r, child number 0
-------------------------------------
select max(x1),max(x2) from demo1 join demo2 using(n)
 
Plan hash value: 604214593
 
------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |      1 |        |      1 |00:00:01.57 |     374 |    360 |
|   1 |  SORT AGGREGATE               |         |      1 |      1 |      1 |00:00:01.57 |     374 |    360 |
|   2 |   NESTED LOOPS                |         |      1 |      1 |      1 |00:00:01.57 |     374 |    360 |
|   3 |    NESTED LOOPS               |         |      1 |      1 |      1 |00:00:01.57 |     373 |    360 |
|   4 |     TABLE ACCESS FULL         | DEMO1   |      1 |      1 |    200K|00:00:00.20 |     369 |    360 |
|*  5 |     INDEX UNIQUE SCAN         | DEMO2PK |    200K|      1 |      1 |00:00:00.42 |       4 |      0 |
|   6 |    TABLE ACCESS BY INDEX ROWID| DEMO2   |      1 |      1 |      1 |00:00:00.01 |       1 |      0 |
------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   5 - access("DEMO1"."N"="DEMO2"."N")
 
Note
-----
   - this is an adaptive plan
 
27 rows selected.
 
SQL> column tracefile new_value tracefile
SQL> alter session set tracefile_identifier='cbo_trace';
Session altered.
 
SQL> select tracefile from v$process where addr=(select paddr from v$session where sid=sys_context('userenv','sid'));
 
TRACEFILE
----------------------------------------------------------------------------------------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/cdb/CDB/trace/CDB_ora_4083_cbo_trace.trc
 
SQL> host > &tracefile.
SQL> exec dbms_sqldiag.dump_trace(p_sql_id=>'d2y436sr1cx3r',p_child_number=>0,p_component=>'Compiler',p_file_id=>'');
PL/SQL procedure successfully completed.
 
SQL> host grep -E "^DP" &tracefile. | tail
DP - distinct placement
DP: Found point of inflection for NLJ vs. HJ: card = 1086.00

Read it from the end:

  1. The point of inflection is higher: 1086, which was my goal.
  2. The number of rows coming from DEMO1 is still 200000, so it is higher than the point of inflection.
  3. We therefore expect a HASH JOIN, because the number of rows is higher than the point of inflection.
  4. But the plan stayed with NESTED LOOPS, because the buffering in the STATISTICS COLLECTOR never reached the point of inflection.
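As a side note, the two measurements so far (a point of inflection of 814 with numblks=3000 and 1086 with numblks=4000) suggest it scales roughly linearly with the numblks I set. That is my extrapolation from two data points, not a documented formula:

```python
# (numblks set on DEMO2, point of inflection from the 10053 trace)
observed = [(3000, 814), (4000, 1086)]
ratios = [ip / blks for blks, ip in observed]
# both ratios are close to 0.27: the point of inflection tracks the
# full-scan cost of DEMO2, which is proportional to its block count
```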

Dichotomy

Using dichotomy, I've scripted similar tests to find the point where reaching the point of inflection no longer triggers a plan switch.
'JOIN' is the join method chosen (from dbms_xplan.display_cursor after execution), 'INFLECTION POINT' is the one gathered from the 10053 trace, and 'STATBLKS' is the numblks value I set for DEMO2 in order to vary the point of inflection.


                JOIN     INFLECTION POINT       HASH_AREA_SIZE               BUFFER             STATBLKS                 LPAD
              NESTED               271889              1000000              2175117              1000000                    1
              NESTED               135823              1000000              1086590               500000                    1
              NESTED                67789              1000000               542319               250000                    1
              NESTED                33885              1000000               271087               125000                    1
              NESTED                16943              1000000               135551                62500                    1
              NESTED                 8471              1000000                67775                31250                    1
              NESTED                 4238              1000000                33904                15625                    1
              NESTED                 2120              1000000                16960                 7813                    1
              NESTED                 1060              1000000                 8480                 3907                    1
                HASH                  532              1000000                 4256                 1954                    1
                HASH                  796              1000000                 6368                 2930                    1
                HASH                  928              1000000                 7424                 3418                    1
                HASH                  994              1000000                 7952                 3662                    1
                HASH                 1026              1000000                 8208                 3784                    1
              NESTED                 1044              1000000                 8352                 3845                    1
                HASH                 1036              1000000                 8288                 3814                    1
                HASH                 1040              1000000                 8320                 3829                    1
              NESTED                 1042              1000000                 8336                 3837                    1
                HASH                 1040              1000000                 8320                 3833                    1
                HASH                 1040              1000000                 8320                 3835                    1
              NESTED                 1042              1000000                 8336                 3836                    1

I've added some variations on hash_area_size (my guess, which turned out to be wrong, was that it makes sense to buffer up to that amount, because this is at least what will go to the hash area if the hash join is finally chosen) and on the DEMO1 row size (by varying an lpad on column X1).
So far, when the point of inflection is 1040 or less a plan switch occurs, and when it is 1042 or more no plan switch occurs.
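The dichotomy itself can be sketched as a plain bisection (the `plan_switches` callback stands for one full test run — set the fake statistics, execute, check the chosen plan — and is an illustrative name, not a real API):

```python
def find_switch_limit(plan_switches, lo=1, hi=300000):
    """Return the largest point of inflection for which the plan still
    switches to HASH JOIN.  plan_switches(ip) must run one test at that
    point of inflection and report whether the switch happened."""
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if plan_switches(mid):
            lo = mid    # still switches: the limit is at least mid
        else:
            hi = mid    # no switch: the limit is below mid
    return lo

find_switch_limit(lambda ip: ip <= 1040)    # -> 1040, the boundary measured above
```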

But there are probably other parameters influencing this.

Any idea welcome…

Update 30 minutes later

Thanks to Chris Antognini, it appears that the parameter influencing the number of rows buffered is not the actual size of the rows. The number of rows to buffer is calculated from the theoretical (declared) size of the columns. Which is very bad, in my opinion, given the number of applications that declare column sizes at their maximum. And I see no reason why it has to be done like that: rows are of variable size, and sizing buffers from the column definition is not a good idea. It is reminiscent of the JDBC fetch size issue, very well described by Sigrid Keydana.

Here is the limit for different declared sizes of the VARCHAR2 column:


                JOIN     INFLECTION POINT       HASH_AREA_SIZE          BUFFER SIZE             STATBLKS         VARCHAR SIZE
              NESTED               155345                65536              3572954               572641                    1
              NESTED                74899                65536              3894796               276092                   30
              NESTED                63551                65536              3940220               234258                   40
              NESTED                33289                65536              4061376               122802                  100
              NESTED                12865                65536              4142848                47453                  300
              NESTED                 7973                65536              4162422                29409                  500
              NESTED                 4090                65536              4179980                15079                 1000
              NESTED                 2072                65536              4189584                 7635                 2000
              NESTED                 1388                65536              4194536                 5110                 3000

So it looks like Oracle allocates a buffer of a few megabytes, calculates how many rows can fit in it given their declared column sizes, and limits the buffering to that number of rows.
The nonsense, in my opinion, is that a size calculated from the column definitions is known at parse time, which is exactly when the point of inflection is determined. It makes no sense to set a point of inflection higher than the number of rows that can be buffered.
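A rough sanity check of that reading, using the figures from the table above. The ~22-byte per-row overhead is my fit to the data, not a documented value:

```python
overhead = 22        # assumed per-row overhead, fitted to the measurements
observed = [         # (declared VARCHAR2 size, buffer size, row limit) from the table
    (1000, 4179980, 4090),
    (2000, 4189584, 2072),
    (3000, 4194536, 1388),
]
for size, buffer_bytes, limit in observed:
    estimate = buffer_bytes / (size + overhead)
    # each estimate matches the measured row limit
```

With a 22-byte overhead, each measured buffer size divides exactly into the measured row limit, which supports the hypothesis: row limit = buffer size / declared row size.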