views:

326

answers:

4

I should probably know this by now, but what, if any is the difference between the two statements below?

The nested join:

SELECT
    t1.*
FROM
    table1 t1
    INNER JOIN table2 t2
     LEFT JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
    ON t2.table2_ID = t1.table1_ID

The more traditional join:

SELECT
    t1.*
FROM
    table1 t1
    INNER JOIN table2 t2 ON t2.table2_ID = t1.table1_ID
    LEFT JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
+3  A: 

For your specific example, I don't think there should be any difference in the query plans generated, but there's definitely a difference in readability. Your 2nd example is MUCH easier to follow.

If you were to reverse the types of joins in the example, you could end up with much different results.

SELECT    t1.*
FROM    table1 t1
    LEFT JOIN table2 t2 ON t2.table2_ID = t1.table1_ID
    INNER JOIN table3 t3 ON t3.table3_ID = t2.table2_ID

-- may not produce the same results as...

SELECT    t1.*
FROM    table1 t1
    LEFT JOIN table2 t2
        INNER JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
    ON t2.table2_ID = t1.table1_ID

Based on the fact that order of the joins DOES matter in many cases - careful thought should go into how you're writing your join syntax. If you find that the 2nd example is what you're really trying to accomplish, i'd consider rewriting the query so that you can put more emphasis on the order of your joins...

SELECT    t1.*
FROM    table2 t2
        INNER JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
        RIGHT JOIN table1 t1 ON t2.table2_ID = t1.table1_ID
Scott Ivey
+4  A: 

Well, it's the order of operations..

SELECT
    t1.*
FROM
    table1 t1
    INNER JOIN table2 t2
        LEFT JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
    ON t2.table2_ID = t1.table1_ID

could be rewritten as:

SELECT
    t1.*
FROM
       table1 t1                                                       -- inner join t1
    INNER JOIN 
       (table2 t2 LEFT JOIN table3 t3 ON t3.table3_ID = t2.table2_ID)  -- with this 
    ON t2.table2_ID = t1.table1_ID                                     -- on this condition

So basically, first you LEFT JOIN t2 with t3, based on the join condition: table3_ID = table2_ID, then you INNER JOIN t1 with t2 on table2_ID = table1_ID.

In your second example you first INNER JOIN t1 with t2, and then LEFT JOIN the resulting inner join with table t3 on the condition table2_ID = table1_ID.

SELECT
    t1.*
FROM
    table1 t1
    INNER JOIN table2 t2 ON t2.table2_ID = t1.table1_ID
    LEFT JOIN table3 t3 ON t3.table3_ID = t2.table2_ID

could be rewritten as:

SELECT
    t1.*
FROM
        (table1 t1 INNER JOIN table2 t2 ON t2.table2_ID = t1.table1_ID) -- first inner join
    LEFT JOIN                                                           -- then left join
        table3 t3 ON t3.table3_ID = t2.table2_ID                        -- the result with this

EDIT

I apologize. My first remark was wrong. The two queries will produce the same results but there may be a difference in performance as the first query may perform slower than the second query in some instances ( when table 1 contains only a subset of the elements in table 2) as the LEFT JOIN will be executed first - and only then intersected with table1. As opposed to the second query which allows the query optimizer to do it's job.

Miky Dinescu
Clearly the order of operations is different, but is there any actual difference (performance, results, anything)?
Roee Adler
Please give an example of data where the queries would return different results. I was not able to generate any.
Shannon Severance
Ok, thanks for the thoughts on this guys. Its pretty much just like I thought then. The results will be the same, but the queries may perform differently. I think for ease of use, I'll change the old (nested) query to use the more traditional syntax.
TehOne
+2  A: 

The best way to see what is different in these two queries is to compare the Query Plan for both these queries.

There is no difference in the result sets for these IF there are always rows in table3 for a given row in table2.

I tried it on my database and the difference in the query plans was that 1. For the first query, the optimizer chose to do the join on table2 and table 3 first. 2. For the second query, the optimizer chose to join table1 and table2 first.

Ralph Wiggum
For me, the execution plans for my real queries were exactly the same.
TehOne
A: 

You should see no difference at all between the two queries, provided your DBMS' optimizer is up to scratch. That, however, even for big-iron, high-cost platforms, is not an assumption I'd be confident in making, so I'd be fairly unsurprised to discover that query plans (and consequently execution times) varied.

Mike Woodhouse