Yesterday, I had a discussion with one of my friend who works with Oracle about some different behaviours between each query optimizer against one particular query. You can read his blog post and get the repro script here. I decided to write a blog post because our discussion has introduced some interesting concepts and I wanted to explain more things from the SQL Server side.

First of all let me set the scene that is pretty basic as shown below:

blog 60- 1 - case description

Note the implementation of the foreign keys that is very important because they will allow to the query optimizer to assume safely some assumptions: col2a column values in the Table2 exist in the Col1a column in the Table1 and in the same manner, Col2b column values in the Table2 and the Col3c column values in the Table3.

Now let’s execute the following query with SQL Server as my friend did on Oracle …

select 
	t1.col1a,
	t2.*,
	t3.col3b
from table1 t1
inner join table2 t2 
	on t1.col1a=t2.col2a
inner join table3 t3 
	on t2.col2b=t3.col3b
where t3.col3b=51
	OR t1.col1a=50;

 

… and let’s get the execution plan:

blog 60- 2 - Execution plan

 

My first conclusion was that this execution plan is pretty normal because we want to retrieve values from columns in the 3 tables (projection) and in the same time we use the predicates that concern the columns of both the Table1 and Table3 tables. At this point, I didn’t have in mind the definition of these tables. However, my friend told me that this query can be simplified because of the transitive property of the join predicates and he was right. Indeed, the query optimizer may potentially use them with some transitive closure strategies to produce a more efficient query plan.

The transitive properties can be located here:

t2.col2b = t3.col3b = 51
t2.col2a = t1.col1a = 50

Go ahead and now let’s assume we can also apply some commutative rules here in the projection. The columns below are concerned by the foreign key and we can assume safely that their values are equal

t1.col1a = t2.col2a
t3.col3b = t2.col2b

Then, from these two rules we may now deduce that the following query is equivalent to our first query:

select 
	t2.col2a,
	t2.*,
	t2.col2b
from table1 t1
inner join table2 t2
	 on t1.col1a=t2.col2a
inner join table3 t3 
	on t2.col2b=t3.col3b
where t2.col2b=51
	OR t2.col2a=50;

 

We can still simplify the query as follows …

select 
	t2.*
from table1 t1
inner join table2 t2
	 on t1.col1a=t2.col2a
inner join table3 t3 
	on t2.col2b=t3.col3b
where t2.col2b=51
	OR t2.col2a=50;

 

… and finally get the following execution plan:

blog 60- 3 - Execution plan simplified

 

This is exactly the kind of simplification that my friend got from Oracle’s query optimizer. Let’s go back to the first query and let’s display the join simplification phase of the query optimization process by implementing the query trace 8606

select 
	t1.col1a,
	t2.*,
	t3.col3b
from table1 t1
inner join table2 t2 
	on t1.col1a=t2.col2a
inner join table3 t3 
	on t2.col2b=t3.col3b
where t3.col3b=51
	OR t1.col1a=50
option
(
    querytraceon 3604,  -- Output to console
    querytraceon 8606  -- Logical operators tree
)
go

 

Here, our focus is on Join-collapsed Tree section (the tree after initial join reordering and removing unnecessary joins) and unfortunately we don’t see any relevant stuff performed by the query optimizer (at least what we can expect).

blog 60- 4 - join simplification

 

Let’s take a look at the same phase with the simplified query this time:

select 
	t2.*
from table1 t1
inner join table2 t2
	 on t1.col1a=t2.col2a
inner join table3 t3 
	on t2.col2b=t3.col3b
where t2.col2b=51
	OR t2.col2a=50
option
(
    querytraceon 3604,  -- Output to console
    querytraceon 8606  -- Logical operators tree
)
go

blog 60- 5 - join simplification 2

 

This time the story is not the same because the query optimizer is able to perform the necessary eliminations joins but we had to help it a little bit ..

Hope this helps.

By David Barbarin