tags:

views:

1879

answers:

5

Which of these queries is the faster?

NOT EXISTS:

SELECT ProductID, ProductName 
FROM Northwind..Products p
WHERE NOT EXISTS (SELECT 1 FROM Northwind..[Order Details] od WHERE p.ProductId = od.ProductId)

Or NOT IN:

SELECT ProductID, ProductName 
FROM Northwind..Products p
WHERE p.ProductID NOT IN (SELECT ProductID FROM Northwind..[Order Details])

The query execution plan says they both do the same thing. If that is the case, which is the recommended form?

This is based on the NorthWind database.

[Edit]

Just found this helpful article: http://weblogs.sqlteam.com/mladenp/archive/2007/05/18/60210.aspx

I think I'll stick with NOT EXISTS.

+6  A: 

If the execution planner says they're the same, they're the same. Use whichever one will make your intention more obvious -- in this case, the second.

John Millikin
+1  A: 

It depends..

SELECT x.col
FROM big_table x
WHERE x.key IN( SELECT key FROM really_big_table );

would not be relatively slow the isn't much to limit size of what the query check to see if they key is in. EXISTS would be preferable in this case.

But, depending on the DBMS's optimizer, this could be no different.

As an example of when EXISTS is better

SELECT x.col
FROM big_table x
WHERE EXISTS( SELECT key FROM really_big_table WHERE key = x.key);
  AND id = very_limiting_criteria
Greg Ogle
+4  A: 

Actually, I believe this would be the fastest:

SELECT ProductID, ProductName 
    FROM Northwind..Products p  
          outer join Northwind..[Order Details] od on p.ProductId = od.ProductId)
WHERE od.ProductId is null
James Curran
Might not be the fastest when the optimizer is doing it's job, but certainly will be faster when it's not.
Cade Roux
He may have simplified his query for this post too
Kip
Agree Left outer join is often faster than a subquery.
HLGEM
+3  A: 

In your specific example they are the same, because the optimizer has figured out what you are trying to do is the same in both examples. But it is possible that in non-trivial examples the optimizer may not do this, and in that case there are reasons to prefer one to other on occasion.

NOT IN should be preferred if you are testing multiple rows in your outer select. The subquery inside the NOT IN statement can be evaluated at the beginning of the execution, and the temporary table can be checked against each value in the outer select, rather than re-running the subselect every time as would be required with the NOT EXISTS statement.

If the subquery must be correlated with the outer select, then NOT EXISTS may be preferable, since the optimizer may discover a simplification that prevents the creation of any temporary tables to perform the same function.

Jeffrey L Whitledge
A: 

If the optimizer says they are the same then consider the human factor. I prefer to see NOT EXISTS :)

onedaywhen