ansaurus

Question

SQL Join Types and Performance: Cross vs Inner

Answer 1

+3 A:

Your first example is normally called an explicit join and the second one an implicit join. Performance-wise, they should be equivalent, at least in the popular DBMSes.
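For anyone who wants to check the equivalence themselves, here is a minimal sketch using Python's built-in sqlite3 module. The customers/orders schema and its data are made up for illustration, and SQLite is only a stand-in: other engines print their plans differently, so treat the plan output as something to eyeball rather than as proof.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
# SQLite stands in for "a popular DBMS" here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 9.5), (11, 1, 20.0), (12, 3, 5.0);
""")

# Explicit join: the join predicate lives in the ON clause.
explicit_sql = """
    SELECT c.name, o.total
      FROM customers AS c
           INNER JOIN orders AS o
             ON o.customer_id = c.id
"""

# Implicit join: comma-separated tables, join predicate in WHERE.
implicit_sql = """
    SELECT c.name, o.total
      FROM customers AS c, orders AS o
     WHERE o.customer_id = c.id
"""

explicit_rows = sorted(conn.execute(explicit_sql).fetchall())
implicit_rows = sorted(conn.execute(implicit_sql).fetchall())
print(explicit_rows == implicit_rows)

# EXPLAIN QUERY PLAN output is SQLite-specific and varies by version,
# so it is printed for inspection rather than compared.
for label, sql in (("explicit", explicit_sql), ("implicit", implicit_sql)):
    print(label, conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall())
```

Both forms return the same rows. Whether the two plans also match is something to confirm on your own DBMS and version.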

Daniel Vassallo 2010-07-15 03:40:28

So, the database engine knows enough to order however many joins you're performing in the way that produces the minimally-sized intermediate subset at each step?

Borealid 2010-07-15 03:44:23

As far as I know, inner joins can be considered commutative, and you can list them in any order and you will get the same results. The query optimizer will internally determine the ideal order of the joins based on various heuristics.
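The "same results in any order" part is easy to try out. The sketch below (hypothetical tables a, b and c, run on SQLite via Python's sqlite3 module) lists the three tables in every possible order and collects the rows. It only checks the results; it says nothing about which order a given optimizer actually picks.

```python
import itertools
import sqlite3

# Hypothetical three-table chain: a -> b -> c.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, b_id INTEGER);
    CREATE TABLE b (id INTEGER PRIMARY KEY, c_id INTEGER);
    CREATE TABLE c (id INTEGER PRIMARY KEY, label TEXT);
    INSERT INTO a VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO b VALUES (1, 1), (2, 2);
    INSERT INTO c VALUES (1, 'x'), (2, 'y');
""")

# Run the same inner join with the tables listed in all 6 orders.
results = {}
for order in itertools.permutations(["a", "b", "c"]):
    sql = (
        "SELECT a.id, b.id, c.label FROM " + ", ".join(order)
        + " WHERE a.b_id = b.id AND b.c_id = c.id"
    )
    # Sort because SQL gives no row-order guarantee without ORDER BY.
    results[order] = sorted(conn.execute(sql).fetchall())

distinct_results = {tuple(rows) for rows in results.values()}
print(len(results), "orderings,", len(distinct_results), "distinct result set(s)")
```

Every ordering produces the same set of rows, which is what gives an optimizer the freedom to choose the order itself.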

Daniel Vassallo 2010-07-15 03:49:20

Yes, you will get the same results, but the performance difference can be crazy. For instance, if you have a table which matches none of your join criteria, you will always end up with an empty set, but if you do a bunch of work *before* including that table...

Borealid 2010-07-15 03:51:22

I might be wrong, but I think the query optimizer should deal with that in normal circumstances. Do you have any experience with the issue you are mentioning (i.e., a different order of joins causing a performance difference)?

Daniel Vassallo 2010-07-15 03:53:31

Yes, in a database design class I took a while back, we sped up queries by 100x and sometimes more by reordering the joins. How would the query optimizer know what to do? It would have to evaluate the conditions to know what size they'd make the set... And the conditions can even be aggregation functions!

Borealid 2010-07-15 03:59:42

@Borealid: Let's see what the other SQL experts have to say. In the meantime, I found this answer on SO which is related to this topic: http://stackoverflow.com/questions/228424/in-what-order-are-mysql-joins-evaluated/228468#228468

Daniel Vassallo 2010-07-15 04:05:49

Do note that we weren't using MySQL for the class - I don't know on what software our queries were running. It could have been a custom no-optimization engine, since the whole point was to talk about set theory -_-.

Borealid 2010-07-15 04:11:00

@Borealid: That would explain a lot then :) Personally, I never had to optimize my queries by reordering the joins. I always left it up to the query optimizer... If there are really cases in popular DBMSes where the order of joins can affect performance, I'd expect that to happen only in some remote edge case.

Daniel Vassallo 2010-07-15 04:15:41

On Oracle 6 and earlier, the order in which you wrote the joins (particularly outer joins, but inner joins too) had an extremely strong impact on how the query was constructed and run. Differences of orders of magnitude were common. However, this progressively went away with later versions, and I'd be surprised if it made much difference in a modern system, provided the statistics are up to date.

Cruachan 2010-07-15 09:32:34

Answer 2

+1 A:

Re-ordering inner-join criteria is extremely easy for the optimizer to do, and there should be very little chance of it getting that wrong. If statistics are out of date, though, all bets are off: it may re-order the joins to use a table with bad statistics first. Of course, stale statistics may affect you even if you chose the order yourself.
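To make "statistics" a bit more concrete, here is a small sketch of how one engine stores them. It uses SQLite through Python's sqlite3 module purely as an assumption for illustration (the events table and its index are hypothetical). In SQLite the optimizer's statistics live in a table called sqlite_stat1, which is only written when ANALYZE runs, so rows added afterwards are not reflected until it runs again. Other engines have their own commands and storage for this.

```python
import sqlite3

# Hypothetical table with a skewed column, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT);
    CREATE INDEX events_kind ON events (kind);
""")
conn.executemany(
    "INSERT INTO events (kind) VALUES (?)",
    [("click",)] * 90 + [("purchase",)] * 10,
)
conn.commit()


def has_stats(connection):
    """Return True if the sqlite_stat1 statistics table exists."""
    row = connection.execute(
        "SELECT count(*) FROM sqlite_master WHERE name = 'sqlite_stat1'"
    ).fetchone()
    return row[0] > 0


# Before ANALYZE the planner has no gathered statistics to consult.
stats_before = has_stats(conn)

# ANALYZE gathers statistics and stores them in sqlite_stat1.
conn.execute("ANALYZE")
stats_after = conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()

print("stats table before ANALYZE:", stats_before)
for tbl, idx, stat in stats_after:
    print(tbl, idx, stat)
```

The stat column is a snapshot taken when ANALYZE ran. If the data changes a lot afterwards, the optimizer keeps planning against the old numbers until the statistics are refreshed.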

At least in SQL Server, the optimizer can often even push inner join criteria down through views and inline table-valued functions so that they can be highly selective as early as possible.

Cade Roux 2010-07-15 04:57:53

Answer 3

+1 A:

I think most 'SQL experts' would write the query more like this:

SELECT * 
  FROM t1
       INNER JOIN t2 
         ON t1.t2_id = t2.t1_id 
 WHERE t1.foo='bar'
       AND t2.bar = 'baz';

Specifically:

have strong preference for the INNER JOIN syntax (though may choose to omit the INNER keyword);
put only the 'join' predicates in the JOIN clause;
put the 'filter' predicates in the WHERE clause.

The difference between a 'join' search condition and a 'filter' search condition is subjective, but there is much consensus among practitioners.
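As a rough check that this split is a matter of style rather than semantics for inner joins, the sketch below runs the query both ways on SQLite through Python's sqlite3 module. The table and column names follow the query above; the rows are made up for illustration.

```python
import sqlite3

# Tables named as in the query above; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (t2_id INTEGER, foo TEXT);
    CREATE TABLE t2 (t1_id INTEGER, bar TEXT);
    INSERT INTO t1 VALUES (1, 'bar'), (2, 'bar'), (3, 'other');
    INSERT INTO t2 VALUES (1, 'baz'), (2, 'nope'), (3, 'baz');
""")

# Join predicate in ON, filter predicates in WHERE.
filters_in_where = """
    SELECT t1.t2_id, t1.foo, t2.bar
      FROM t1
           INNER JOIN t2
             ON t1.t2_id = t2.t1_id
     WHERE t1.foo = 'bar'
           AND t2.bar = 'baz'
"""

# Everything crammed into the ON clause.
filters_in_on = """
    SELECT t1.t2_id, t1.foo, t2.bar
      FROM t1
           INNER JOIN t2
             ON t1.t2_id = t2.t1_id
                AND t1.foo = 'bar'
                AND t2.bar = 'baz'
"""

where_rows = sorted(conn.execute(filters_in_where).fetchall())
on_rows = sorted(conn.execute(filters_in_on).fetchall())
print(where_rows == on_rows, where_rows)
```

Both versions return the same single row here. This only holds for inner joins: with an outer join, moving a predicate between ON and WHERE can change the result, which is one more reason to keep the two kinds of predicate apart.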

P.S. what you call a 'cross join' isn't :) As you say, the two queries are equivalent (both 'logical' inner joins, if you will) but the one that doesn't use the explicit [INNER] JOIN syntax uses what is known as infixed notation.

onedaywhen 2010-07-15 09:26:18
