Infrastructure at your Service

Daniel Westermann

How to destroy your performance: PL/SQL vs SQL

Disclaimer: This is in no way a recommendation to avoid PL/SQL. This post just describes a case I faced at a customer with a specific implementation in PL/SQL the customer (and me) believed is the most efficient way of doing it in PL/SQL. This was a very good example for myself to remind me to check the documentation and to verify if what I believed a feature does is really what the feature is actually doing. When I was doing PL/SQL full time in one my of previous jobs I used the feature heavily without really thinking on what happened in the background. Always keep learning …

Lets start by building the test case. The issue was on 12.1.0.2 on Linux but I think this will be reproducible on any release (although, never be sure :) ).

SQL> create table t1 as select * from dba_objects;
SQL> insert into t1 select * from t1;
SQL> /
SQL> /
SQL> /
SQL> /
SQL> /
SQL commit;
SQL> select count(*) from t1;

  COUNT(*)
----------
   5565632

SQL> create table t2 as select object_id from t1 where mod(object_id,33)=0;
SQL> select count(*) from t2;

  COUNT(*)
----------
    168896

This are my two tables used for the test: t1 contains around 5,5 millions rows and there is t2 which contains 168896 rows. Coming to the issue: There is a procedure which does this:

create or replace procedure test_update
is
  cursor c1 is select object_id from t2;
  type tab is table of t2.object_id%type index by pls_integer;
  ltab tab;
begin
  open c1;
  fetch c1 bulk collect into ltab;
  close c1;
  forall indx in 1..ltab.count
    update t1 set owner = 'AAA' where object_id = ltab(indx);
end test_update;
/

The procedure uses “bulk collect” and “forall” to fetch the keys from t2 in a first step and then uses these keys to update t1 in a second step. Seemed pretty well done: Not a loop over each single row, compare with the list and then do the update when there is a match. I really couldn’t see an issue here. But when you execute this procedure you’ll wait for ages (at least if you are in VM running on a notebook and not on super fast hardware).

The situation at the customer was that I was told that the update, when executed as plain SQL in sqlplus, takes less than a second. And really, when you execute this on the test case from above:

SQL> update t1 set owner = 'AAA' where object_id in ( select object_id from t2 );

168896 rows updated.

Elapsed: 00:00:05.30
SQL> rollback;

Rollback complete.

Elapsed: 00:00:02.44
SQL> update t1 set owner = 'AAA' where object_id in ( select object_id from t2 );

168896 rows updated.

Elapsed: 00:00:06.34
SQL> rollback;

Rollback complete.

Elapsed: 00:00:02.70
SQL>

It is quite fast (between 5 and 6 seconds on my environment). So why is the PL/SQL version so much slower? Aren’t “bulk collect” and “forall” the right methods to boost performance? Lets take a look at the execution plan for the plain SQL version:

----------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name     | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |    A-Time     | Buffers | Reads  |  OMem |  1Mem |  O/1/M|
----------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT      |          |      1 |       |       | 24303 (100)|          |       0 |00:00:04.52    |     259K|   9325 |       |       |       |
|   1 |  UPDATE               | T1       |      1 |       |       |            |          |       0 |00:00:04.52    |     259K|   9325 |       |       |       |
|*  2 |   HASH JOIN           |          |      1 |    48 |  4416 | 24303   (1)| 00:00:01 |     168K|00:00:01.76    |   86719 |   9325 |  2293K|  2293K|  1/0/0|
|   3 |    VIEW               | VW_NSO_1 |      1 |   161K|  2044K|    72   (2)| 00:00:01 |    2639 |00:00:00.05    |     261 |     78 |       |       |       |
|   4 |     SORT UNIQUE       |          |      1 |     1 |  2044K|            |          |    2639 |00:00:00.04    |     261 |     78 |   142K|   142K|  1/0/0|
|   5 |      TABLE ACCESS FULL| T2       |      1 |   161K|  2044K|    72   (2)| 00:00:01 |     168K|00:00:00.01    |     261 |     78 |       |       |       |
|   6 |    TABLE ACCESS FULL  | T1       |      1 |  5700K|   429M| 23453   (1)| 00:00:01 |    5566K|00:00:05.88    |   86458 |   9247 |       |       |       |
----------------------------------------------------------------------------------------------------------------------------------------------------------------

It is doing a hash join as expected. What about the PL/SQL version? It is doing this:

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------
SQL_ID  4hh65t1u4basp, child number 0
-------------------------------------
UPDATE T1 SET OWNER = 'AAA' WHERE OBJECT_ID = :B1

Plan hash value: 2927627013

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |      |       |       | 23459 (100)|          |
|   1 |  UPDATE            | T1   |       |       |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |   951 | 75129 | 23459   (1)| 00:00:01 |
---------------------------------------------------------------------------

Uh! Why that? This is what I wasn’t aware of. I always thought when you use “forall” to send PL/SQL’s SQL to the SQL engine Oracle would rewrite the statement to expand the list in the where clause or do other optimizations. But this does not happen. The only optimization that takes place when you use “forall” is that the statements are send in batches to the SQL engine rather then sending each statement after another. What happens here is that you execute 168896 full table scans because the same statement (with a another bind variable value) is executed 168896 times. Can’t be really fast compared to the SQL version.

Of course you could rewrite the procedure to do the same as the SQL but this is not the point here. The point is: When you think what you have implemented in PL/SQL is the same as what you compare to when you run it SQL: Better think twice and even better read the f* manuals, even when you think you are sure what a feature really does :)

Leave a Reply

Daniel Westermann
Daniel Westermann

Senior Consultant and Technology Leader Open Infrastructure