By Franck Pachot
.
I had a comment on object links internals when creating a sharing=object view on GV$SESSION. Before posting about this specific case, I realized that I’ve never explained how a query on an object link is run on the CDB$ROOT container.
Data link testcase
Here is how I create the DEMOT table and DEMOV view as object link.
14:48:58 SQL> connect / as sysdba
Connected.
14:48:59 SQL> alter session set container=CDB$ROOT;
Session altered.
14:48:59 SQL> alter session set "_oracle_script"=true;
Session altered.
14:48:59 SQL> create table DEMOT as select rownum id, rownum num from xmltable('10 to 1000000');
Table created.
SQL> exec dbms_stats.gather_table_stats('','DEMOT');
14:49:00 SQL> create view DEMOV sharing=object as select * from DEMOT;
View created.
14:49:00 SQL> alter session set container=PDB;
Session altered.
14:49:00 SQL> create table DEMOT as select rownum id, rownum num from xmltable('1 to 1');
Table created.
SQL> exec dbms_stats.gather_table_stats('','DEMOT');
14:49:01 SQL> create view DEMOV sharing=object as select * from DEMOT;
View created.
14:49:01 SQL> alter session set "_oracle_script"=false;
Session altered.
And I run the following query two times (easier to look at trace without hard parsing)
14:49:02 SQL> select id from DEMOV where num<=100;
...
100 rows selected.
SQL_TRACE
I’ll detail the sql_trace of the last run.
First, the query is parsed in our PDB:
PARSING IN CURSOR #140360238365672 len=35 dep=0 uid=0 oct=3 lid=0 tim=687080512770 hv=237558885 ad='10cf55ae8' sqlid=' duy45bn72jr35'
select id from DEMOV where num<=100
END OF STMT
PARSE #140360238365672:c=0,e=86,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3055399777,tim=687080512769
EXEC #140360238365672:c=0,e=61,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3055399777,tim=687080512872
There was no hard parse for this second parse call (mis=0 means no ‘Misses in library cache during parse’). Execution occurred but no fetch yet.
At that point, the session switches to CDB$ROOT container (you have to trust me as there is no information about it in the trace file in 12.1)
PARSING IN CURSOR #140360238643592 len=99 dep=1 uid=0 oct=3 lid=0 tim=687080513015 hv=2967959178 ad='107be5590' sqlid=' 3b9x1rasffxna'
SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE) */ ID,NUM FROM "SYS"."DEMOV" "DEMOV" WHERE "DEMOV"."NUM"<=100
END OF STMT
PARSE #140360238643592:c=0,e=33,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=3598352655,tim=687080513014
EXEC #140360238643592:c=0,e=16,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=3598352655,tim=687080513084
FETCH #140360238643592:c=0,e=32,p=0,cr=0,cu=0,mis=0,r=100,dep=1,og=4,plh=3598352655,tim=687080513137
STAT #140360238643592 id=1 cnt=100 pid=0 pos=1 obj=0 op='RESULT CACHE cgn1rxw6ycznac8fyzfursq2z6 (cr=0 pr=0 pw=0 time=12 us)'
STAT #140360238643592 id=2 cnt=0 pid=1 pos=1 obj=98422 op='TABLE ACCESS FULL DEMOT (cr=0 pr=0 pw=0 time=0 us)'
We have here a recursive query (dep=1) that is run on the view in CDB$ROOT. It’s not the same query as ours: FROM clause is our sharing=object view, WHERE clause is the predicates that applies on it, and SELECT clause the columns that we need (ID was in my SELECT clause and NUM was in my WHERE clause). The query is parsed, executed, the 100 rows are fetched and interestingly the result goes to result cache. Yes, if you query V$RESULT_CACHE_OBJECTS in a CDB you will see lot of objects:
#multitenant object links uses result cache internally. AWR views are object links. pic.twitter.com/V5IMW2qQx2
— Franck Pachot (@FranckPachot) August 7, 2016
If you look at the FETCH line above, you see that the second execution was a result cache hit (cr=0)
So, the rows we require from the object link are fetched, then the execution of our query continues in our PDB:
FETCH #140360238365672:c=0,e=235,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=1,plh=3055399777,tim=687080513194
FETCH #140360238365672:c=0,e=11,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,plh=3055399777,tim=687080513375
FETCH #140360238365672:c=0,e=10,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,plh=3055399777,tim=687080513586
FETCH #140360238365672:c=0,e=9,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,plh=3055399777,tim=687080513776
FETCH #140360238365672:c=0,e=9,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,plh=3055399777,tim=687080513983
FETCH #140360238365672:c=0,e=9,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,plh=3055399777,tim=687080514188
FETCH #140360238365672:c=0,e=8,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,plh=3055399777,tim=687080514375
Up there 91 rows were fetched. We can see in the trace that the recursive cursor is closed there (session switches to CDB$ROOT for that):
CLOSE #140360238643592:c=0,e=3,dep=1,type=0,tim=687080514584
And our session is back on PDB container where the remaining rows are fetched and our cursor closed:
FETCH #140360238365672:c=0,e=40,p=0,cr=0,cu=0,mis=0,r=9,dep=0,og=1,plh=3055399777,tim=687080514610
STAT #140360238365672 id=1 cnt=100 pid=0 pos=1 obj=0 op='FIXED TABLE FULL X$OBLNK$ (cr=0 pr=0 pw=0 time=263 us cost=0 size=13000 card=500)'
CLOSE #140360238365672:c=0,e=3,dep=0,type=0,tim=687080516173
As I’ve explained at DOAGDB16 and SOUG Day (next session is planned for UKOUG TECH16), following metadata and object links is done by the session switching from PDB to CDB$ROOT.
TKPROF
Here is the tkprof of the full trace with two executions
Our query on PDB
SQL ID: duy45bn72jr35 Plan Hash: 3055399777
select id
from
DEMOV where num<=100
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 2 0.00 0.00 0 0 0 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 16 0.00 0.00 0 0 0 200
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 20 0.00 0.00 0 0 0 200
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS
Number of plan statistics captured: 2
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
100 100 100 FIXED TABLE FULL X$OBLNK$ (cr=1040 pr=1034 pw=0 time=42636 us cost=0 size=13000 card=500)
The number of executions and row count is correct, but there’s no logical reads here because all block reads occurred through the recursive query. The execution plan shows are full table scan on X$OBLNK$ which is how object link access path are displayed in 12.1
Query on CDB$ROOT
SQL ID: 3b9x1rasffxna Plan Hash: 3598352655
SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE) */ ID,NUM
FROM
"SYS"."DEMOV" "DEMOV" WHERE "DEMOV"."NUM"<=100
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 2 0.00 0.00 0 0 0 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 2 0.03 0.08 2068 2074 0 200
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 6 0.03 0.08 2068 2074 0 200
Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: SYS (recursive depth: 1)
Number of plan statistics captured: 2
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
100 100 100 RESULT CACHE cgn1rxw6ycznac8fyzfursq2z6 (cr=1037 pr=1034 pw=0 time=662 us)
100 50 100 TABLE ACCESS FULL DEMOT (cr=1037 pr=1034 pw=0 time=522 us)
Here is where the work to get rows from the view is, in CDB$ROOT. Only two fetches there (one per execution). As we have seen in the row trace, all rows from the object link were fetched before we issue any fetch call from our query. I did same test with more rows selected and it seems that the fetch size is 200 rows: when 200 rows are fetched from CDB$ROOT, session switches back to PDB to fetch those rows (15 by 15 with the default sqlplus arraysize) and comes again to CDB$ROOT for next 200 rows. This means that they are probably buffered.
Actually there’s a hidden parameter to define that: “_cdb_view_prefetch_batch_size” is set to 200 by default.
Note that the 2000 logical reads are from the first execution only because the second one found the result in result cache.
V$SQL
From SQL_TRACE, the work done in the other container is not included in statement statistics. This makes tuning more difficult as we are used to see recursive work cumulated in the top level statement.
From CDB$ROOT here is what we can see from the shared pool (V$SQL) about the queries I’ve seen in the SQL_TRACE. I query V$SQL with the PLAN_HASH_VALUE (‘phv’ in the SQL_TRACE dump).
14:49:02 SQL> select con_id,sql_id,rows_processed,plan_hash_value,executions,parse_calls,fetches,buffer_gets,sql_text from v$sql where plan_hash_value in (3598352655,3055399777) order by last_active_time;
CON_ID SQL_ID ROWS_PROCESSED PLAN_HASH_VALUE EXECUTIONS PARSE_CALLS FETCHES BUFFER_GETS SQL_TEXT
---------- ------------- -------------- --------------- ---------- ----------- ---------- ----------- --------------------------------------------------------------------------------
4 duy45bn72jr35 200 3055399777 2 2 16 2721 select id from DEMOV where num<=100
4 bynmh7xm4bf54 0 3598352655 0 1 0 61 SELECT * FROM NO_OBJECT_LINK("SYS"."DEMOV")
1 3b9x1rasffxna 200 3598352655 2 2 2 2080 SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE) */ ID,NUM FROM "SYS"."DEMOV" "DEMOV" WHERE
For the two queries we have seen above, the number of executions and fetches matches what we have seen in the trace. However, buffer_gets from the user query includes the logical reads from the recursive query.
But there’s more here. A statement is there with the same PLAN_HASH_VALUE than the internal query. It’s the query on the shared=object view, with the undocumented NO_OBJECT_LINK() function. It is parsed but not executed. This parse occurred in PDB just before switching to CDB$ROOT. This parse occurend only one time when our query was hard parsed. It has the same plan hash value than the internal query because the plan is the same: full table scan on the table.
My understanding of that is that when hard parsing our query and executing the recursive query on CDB$ROOT, the optimizer checks the view definition in the current container (the PDB) by parsing it without following object links (reason for the NO_OBJECT_LINK).
NO_OBJECT_LINK
Here is the parsing of that query with NO_OBJECT_LINK that occurs in the PDB:
PARSING IN CURSOR #140360238422472 len=43 dep=1 uid=0 oct=3 lid=0 tim=687080413554 hv=1715845284 ad='108fc0230' sqlid=' bynmh7xm4bf54'
SELECT * FROM NO_OBJECT_LINK("SYS"."DEMOV")
END OF STMT
PARSE #140360238422472:c=3000,e=2948,p=0,cr=61,cu=0,mis=1,r=0,dep=1,og=4,plh=3598352655,tim=687080413553
CLOSE #140360238422472:c=0,e=3,dep=1,type=0,tim=687080413614
There is no where clause here. I guess that the goal is just to validate the view in the PDB before executing the full query on CDB$ROOT.
Note that query in the CDB$ROOT do not use the NO_OBJECT_LINK here in 12.1 but could have use it to ensure that there are no further links.
Execution plans
With all those recursive queries, how the cardinalities are estimated? DEMOT has no rows in PDB and 1000000 rows in CDB$ROOT. Statistics gathered and I query only 100 rows (they are evenly distributed between low and high value);
The query that is only parsed in PDB:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID bynmh7xm4bf54, child number 0
-------------------------------------
SELECT * FROM NO_OBJECT_LINK("SYS"."DEMOV")
Plan hash value: 3598352655
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)| |
| 1 | TABLE ACCESS FULL| DEMOT | 1 | 6 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------
The query that is run in CDB$ROOT:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID 3b9x1rasffxna, child number 0
-------------------------------------
SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE) */ ID,NUM FROM "SYS"."DEMOV"
"DEMOV" WHERE "DEMOV"."NUM"<=100
Plan hash value: 3598352655
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 568 (100)| |
| 1 | RESULT CACHE | 9cv1sbwyz16651fgh17234v67g | | | | |
|* 2 | TABLE ACCESS FULL| DEMOT | 100 | 1000 | 568 (2)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("NUM"<=100)
Estimation is ok here.
And my user query:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID duy45bn72jr35, child number 0
-------------------------------------
select id from DEMOV where num<=100
Plan hash value: 3055399777
------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 1 (100)|
|* 1 | FIXED TABLE FULL| X$OBLNK$ | 500 | 13000 | 0 (0)|
------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("NUM"<=100)
Here, this estimation is not accurate at all, seems to have an hardcoded value of 500.
Conclusion
Lot of interesting things here. Object link (that you can call data links as well) are processed in a very special way. But don’t worry. Remember that you are not allowed to create them yourself. And there are only very few oracle maintained object links: some PDB information that must be available when PDB are not opened, AWR information that is consolidated in root, some audit reference tables,…
It interesting to see (and think about all consequences) that result cache is used here for internal optimization. Even when you don’t use result cache for your application, you should have a look at it and maybe size it differently than default. In a future post I’ll create a sharing=object view on V$ fixed views and result cache will be even more fun.