views:

451

answers:

5

I'm curious which of the following below would be more efficient? I've always been a bit cautious about using IN because I believe SQL Server turns the result set into a big IF statement. For a large result set this could result in poor performance. For small results sets, I'm not sure either is preferable. For large result sets, wouldn't EXISTS be more efficient?

WHERE EXISTS (SELECT * FROM Base WHERE bx.BoxID = Base.BoxID AND [Rank] = 2)

vs.

WHERE bx.BoxID IN (SELECT BoxID FROM Base WHERE [Rank = 2])
A: 

Off the top of my head and not guaranteed to be correct: I believe the second will be faster in this case.

  1. In the first, the correlated subquery will likely cause the subquery to be run for each row.
  2. In the second example, the subquery should only run once, since not correlated.
  3. In the second example, the IN will short-circuit as soon as it finds a match.
RedFilter
+1  A: 

I'd go with EXISTS over IN, see below link:

SQL Server: JOIN vs IN vs EXISTS - the logical difference

Tanner
+2  A: 

EXISTS will be faster because once the engine has found a hit, it will quit looking as the condition has proved true. With IN it will collect all the results from the subquery before further processing.

keithwarren7
That's a good point. The IN statement requires SQL Server to generate a complete result set, and then create a big IF statement I think.
Randy Minder
This used to be true but in current versions (at least 2008) the optimizer is much smarter... it actually treats IN () just like an EXISTS ().
Aaron Bertrand
+1  A: 

The execution plans are typically going to be identical in these cases, but until you see how the optimizer factors in all the other aspects of indexes etc., you really will never know.

Cade Roux
A: 

I've done some testing on SQL Server 2005 and 2008, and on both the EXISTS and the IN come back with the exact same actual execution plan, as other have stated. The Optimizer is optimal. :)

Something to be aware of though, EXISTS, IN, and JOIN can sometimes return different results if you don't phrase your query just right: http://weblogs.sqlteam.com/mladenp/archive/2007/05/18/60210.aspx

Adam Nofsinger