Sunday, March 22, 2015

How do you explain this correlated subquery?

A subquery has the following general form:

Outer query (some operator) (Inner query)

While non-correlated subqueries are quite easy to understand, correlated subqueries are less obvious.

In the case of a non-correlated subquery, the inner query does not refer to the outer query, so it can be evaluated first and on its own. The value or values it returns are then used by the outer query through some operator (=, in, >, etc.).
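As a minimal sketch of the non-correlated case, assuming the Northwind Orders table that this post uses: the inner query below refers only to its own alias i, so it can be run by itself, and the outer query then filters on the list of cities it produced. This describes the logical order only; the plan SQL Server actually chooses may differ.

```sql
Use Northwind
Go

-- Step 1 (runs on its own): cities that employee 5 or 7 has shipped to
Select distinct i.ShipCity
from Orders i
Where i.EmployeeID in (5,7)
Go

-- Step 2: the same inner query used as a non-correlated subquery.
-- Nothing inside the parentheses refers to the outer alias o.
Select o.EmployeeID, o.ShipName, o.OrderID
from Orders o
Where o.ShipCity in
(Select i.ShipCity
 from Orders i
 Where i.EmployeeID in (5,7))
Go
```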

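In the case of a correlated subquery, the inner query refers to a column of the outer query, so it cannot be run on its own. Conceptually, the outer query is run, and for each row it produces the inner query is evaluated using the values from that row. With exists, the inner query can stop as soon as the first match is found, because only the existence of a row matters, not its contents. The inner query usually just returns some rows or no rows, and the operator applied to it is usually a logical test (exists, not exists, any, all, etc.).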
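The row-by-row evaluation can be tried on a tiny table before moving to a real database. The sketch below uses a hypothetical five-row table variable @Orders (the data is invented for illustration) and needs SQL Server 2008 or later for the multi-row insert.

```sql
-- Hypothetical miniature Orders table, for illustration only
Declare @Orders table (OrderID int, EmployeeID int, ShipCity varchar(20))

Insert into @Orders (OrderID, EmployeeID, ShipCity) values
 (1, 5, 'Lyon'),
 (2, 3, 'Lyon'),
 (3, 7, 'Graz'),
 (4, 1, 'Bern'),
 (5, 2, 'Graz')

-- For each outer row o, the inner query asks:
-- "is there any order to o.ShipCity taken by employee 5 or 7?"
Select o.OrderID, o.EmployeeID, o.ShipCity
from @Orders o
Where exists
(Select i.ShipCity
 from @Orders i
 Where i.ShipCity = o.ShipCity and i.EmployeeID in (5,7))
Order by o.OrderID

-- Rows returned: orders 1, 2, 3 and 5.
-- Order 4 (Bern) is left out: no order to Bern was taken by employee 5 or 7.
```

Orders 2 and 5 were taken by employees 3 and 2, yet they are returned, because the test concerns the city of the outer row and not its employee.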

The correlated subquery you are asking about is the following, run against the Northwind database:

Use Northwind
Go

Select o.EmployeeID, o.ShipName, o.OrderID
from Orders o
Where exists
(Select i.ShipCity
from Orders i
Where i.ShipCity=o.ShipCity and i.EmployeeID in (5,7))

Go

This returns the following result set (6 of the 674 rows are shown):

[Figure: CorrelatedSubquery]
You are probably wondering why employee IDs other than 5 and 7 appear in the result.

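The reason is that exists only tests whether the inner query returns any row at all, giving true or false for each row of the outer query. If it is true, the outer query returns the requested columns from that row of Orders. The inner query's result set is never part of the rows returned. The condition EmployeeID in (5,7) applies only to the inner alias i, so the outer query keeps every order shipped to a city to which employee 5 or 7 has also shipped an order, whichever employee took that order.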

Try the same query, replacing 'exists' with 'not exists'. You will find that the query now returns 156 rows: the orders whose ShipCity has no order taken by employee 5 or 7.

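The total number of rows returned by the unfiltered query,

Select o.EmployeeID, o.ShipName, o.OrderID
from Orders o

is 830, which is 156 + 674. This is expected, because every row of Orders satisfies exactly one of the two tests, exists or not exists.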
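The three counts can be checked side by side in a single statement. This is a sketch that reuses the query from this post unchanged; the column aliases TotalRows, ExistsRows and NotExistsRows are names chosen here for illustration. Going by the figures quoted above, it should report 830, 674 and 156.

```sql
Use Northwind
Go

Select
 -- all orders
 (Select count(*) from Orders) as TotalRows,
 -- orders whose city has at least one order by employee 5 or 7
 (Select count(*)
  from Orders o
  Where exists
  (Select i.ShipCity
   from Orders i
   Where i.ShipCity=o.ShipCity and i.EmployeeID in (5,7))) as ExistsRows,
 -- orders whose city has no order by employee 5 or 7
 (Select count(*)
  from Orders o
  Where not exists
  (Select i.ShipCity
   from Orders i
   Where i.ShipCity=o.ShipCity and i.EmployeeID in (5,7))) as NotExistsRows
Go
```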
