views:

724

answers:

6

I have a relatively simple query joining two tables. The "Where" criteria can be expressed either in the join criteria or as a where clause. I'm wondering which is more efficient.

Query is to find max sales for a salesman from the beginning of time until they were promoted.

Case 1

select salesman.salesmanid, max(sales.quantity)
from salesman
inner join sales  on salesman.salesmanid =sales.salesmanid 
                  and sales.salesdate < salesman.promotiondate
group by salesman.salesmanid

Case 2

select salesman.salesmanid, max(sales.quantity)
from salesman
inner join sales  on salesman.salesmanid =sales.salesmanid 
where sales.salesdate < salesman.promotiondate
group by salesman.salesmanid

Note Case 1 lacks a where clause altogether

RDBMS is Sql Server 2005

EDIT If the second piece of the join criteria or the where clause was sales.salesdate < some fixed date so its not actually any criteria of joining the two tables does that change the answer.

A: 

It may seem flippant, but the answer is whichever query for which the query analyzer produces the most efficient plan.

To my mind, they seem to be equivalent, so the query analyzer may well produce identical plans, but you'd have to test.

Craig Shearer
+1  A: 

I wouldn't use performance as the deciding factor here - and quite honestly, I don't think there's any measurable performance difference between those two cases, really.

I would always use case #2 - why? Because in my opinion, you should only put the actual criteria that establish the JOIN between the two tables into the JOIN clause - everything else belongs in the WHERE clause.

Just a matter of keeping things clean and put things where they belong, IMO.

Obviously, there are cases with LEFT OUTER JOINs where the placement of the criteria does make a difference in terms of what results get returned - those cases would be excluded from my recommendation, of course.

Marc

marc_s
+1 - good answer marc
Scott Ivey
A: 

Neither is more efficient, using the WHERE method is considered the old way to do so (http://msdn.microsoft.com/en-us/library/ms190014.aspx). YOu can look at the execution plan and see they do the same thing.

Bryan S.
My example wasn't the best change the second part of the join and or where clause to be a fixed date instead of a date in the salesman table.
Gratzy
Ahh okay, well in either case the Execution Plan is your friend. Analyze it to find which one has the best performance. My bet is they'll be identical.
Bryan S.
+1  A: 

I don't think you'll find a finite answer for this one that applies to all cases. The 2 are not always interchangeable - since for some queries (some left joins) you will come up with different results by placing the criteria in the WHERE vs the FROM line.

In your case, you should evaluate both of these queries. In SSMS, you can view the estimated and actual execution plans of both of these queries - that would be a good first step in determining which is more optimal. You could also view the time & IO for each (set statistics time on, set statistics io on) - and that will also give you information to make your decision.

In the case of the queries in your question - I'd bet that they'll both come out with the same query plan - so in this case it may not matter, but in others it could potentially produce different plans.

Try this to see the difference between the 2...

SET STATISTICS IO ON
SET STATISTICS TIME ON

select salesman.salesmanid, 
       max(sales.quantity)
from   salesmaninner join sales on salesman.salesmanid =sales.salesmanid
       and sales.salesdate < salesman.promotiondate
group by salesman.salesmanid

select salesman.salesmanid, 
       max(sales.quantity)
from   salesmaninner join sales on salesman.salesmanid = sales.salesmanid 
where  sales.salesdate < salesman.promotiondate
group by salesman.salesmanid

SET STATISTICS TIME OFF
SET STATISTICS IO OFF
Scott Ivey
This would be specifically inner joins
Gratzy
Gratzy - in that case, I agree with marc_s - the filter criteria should be kept in the WHERE portion of the query, and the join criteria should be kept in the FROM portion of the query.
Scott Ivey
Does it matter though that the where criteria table has many, many more rows. I know that may sound trivial but I was wondering if it then would be better to make it part of the join criteria instead of the where clause
Gratzy
I don't think that should have any impact - but I'd still suggest looking at the plans generated for both queries. In general, i think your code will be easier to read if you keep the filter and join criteria separate. In practice, that might not always be possible - but given identical query plans, i'd always choose the 2nd query for readability.
Scott Ivey
+2  A: 

You can run the execution plan estimator and sql profiler to see how they stack up against each other.

However, they are semantically the same underneath the hood according to this SQL Server MVP:

http://www.eggheadcafe.com/conversation.aspx?messageid=29145383&amp;threadid=29145379

klabranche
A: 

Become familiar with the Estimated Execution Plan in SQL Management Studio!! Like others have said, you're at the mercy of the analyzer no matter what you do so trust its estimates. I would guess the two you provided would produce the exact same plan.

If it's an attempt to change a development culture, pick the one that gives you a better plan; for the ones that are identical, follow the culture

I've commented this on other "efficiency" posts like this one (it's both sincere and sarcastic) -- if this is where your bottlenecks reside, then high-five to you and your team.

Austin Salonen
Yes I am familiar with Estimated Execution Plan. It however will not tell me why it generates the execution plan the way it does. I was looking to understand why one might be more efficient than the other.
Gratzy
@Gratzy: http://www.amazon.com/Gurus-Guide-Server-Architecture-Internals/dp/0201700476
Austin Salonen