I don't think it makes any difference to the database, but when joining tables, which order do you prefer to write the condition:

SELECT
    ...
    FROM AAA
        INNER JOIN BBB ON AAA.ID=BBB.ID
    WHERE ...

OR

SELECT
    ...
    FROM AAA
        INNER JOIN BBB ON BBB.ID=AAA.ID
    WHERE ...
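
One way to check the "no difference to the database" assumption is to run both forms against a throwaway database. This is only a minimal sketch using Python's built-in sqlite3 module. The table names follow the question, while the `val` columns, the sample rows, and the `join_rows` helper are made up for illustration.

```python
import sqlite3

def join_rows(on_clause):
    """Run the question's INNER JOIN with the given ON clause; return sorted rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and data, not taken from the question
    conn.executescript("""
        CREATE TABLE AAA (ID INTEGER PRIMARY KEY, val TEXT);
        CREATE TABLE BBB (ID INTEGER PRIMARY KEY, val TEXT);
        INSERT INTO AAA VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
        INSERT INTO BBB VALUES (2, 'b2'), (3, 'b3'), (4, 'b4');
    """)
    sql = f"SELECT AAA.ID, AAA.val, BBB.val FROM AAA INNER JOIN BBB ON {on_clause}"
    rows = sorted(conn.execute(sql).fetchall())
    conn.close()
    return rows

rows_a_first = join_rows("AAA.ID=BBB.ID")  # option 1
rows_b_first = join_rows("BBB.ID=AAA.ID")  # option 2
print(rows_a_first == rows_b_first)  # prints True
```

Equality is symmetric, so both forms should return the same rows on any engine. The sketch only demonstrates it for SQLite.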
+2  A: 

I prefer the first option. You're going from A to B, so the order of the fields seems more appropriate.

nickd
+5  A: 

I prefer the second option where the most recently written table comes first.

I think LINQ requires it the other way round, though (option 1).

Michael Haren
Apparently I'm the only one...
Michael Haren
@Michael Haren, I prefer A=B, but you are not the only one. I asked this question to see if I was crazy or not; many people I work with use B=A!
KM
I prefer this as well.
CoverosGene
+1 You're not the only one. And yes, LINQ requires the first format (A=B): the earlier range variable has to be on the left of `equals`.
Gordon Bell
+4  A: 

I always do

FROM TABLE_A A
JOIN TABLE_B B ON A.Column = B.Column
Kevin
+1  A: 

No difference, but I'd go for "AAA INNER JOIN BBB ON AAA.ID=BBB.ID" for clarity (but with aliases).

gbn
+1  A: 

I use the first syntax, AAA.ID = BBB.ID. In my opinion it makes the code easier to read, as the join participants follow the table order.

ekoner
+2  A: 

I don't think it really makes a difference, but I prefer

INNER JOIN BBB ON AAA.ID=BBB.ID

because it is consistent with LINQ.

Eldila
+2  A: 

Another mild voice for AAA.ID = BBB.ID. It seems to make more sense to me, but it doesn't really matter.

On a tangentially related note, however, I've recently begun questioning how I write equality tests. I've always preferred:

If ValueInQuestion = TestValue Then
...

That is,

If fullMoonsThisMonth = 2 Then
...

In others' code I've frequently seen this test reversed, and it bugged me for a while. I came to realize that my preference is based solely on which formulation sounds "better" in English, and that there is sometimes a good reason for putting the invariant value on the left. In languages with only one operator for both equality testing and assignment (such as VB, in case you didn't recognize the samples...), the compiler will then stop you from accidentally making an assignment when you meant to do a test.

RolandTumble
+1  A: 

I tend to use both as it makes no difference at all.

HLGEM
Except if using LINQ
Gordon Bell
+4  A: 

I prefer the second example (B=A) because in the join I am listing the criteria that determine which B rows should be included. In other words, I want all rows from B where "X" is true of B. This is also consistent when I need to check for criteria beyond just FKs. For example:

SELECT
     some_columns
FROM
     Table_A A
INNER JOIN Table_B B ON
     B.a_id = A.a_id AND
     B.active = 1

In my opinion it wouldn't have the same readability if I had:

1 = B.active

Also consider the cases where your join criteria include more than one table:

SELECT
     some_columns
FROM
     Table_A A
INNER JOIN Table_B B ON
     B.a_id = A.a_id AND
     B.active = 1
INNER JOIN Table_C C ON
     C.a_id = A.a_id AND
     C.b_id = B.b_id AND
     C.category = 'Widgets'

To me that makes the criteria for which rows from C should be included very clear.

Tom H.
This is exactly why I prefer B=A format, I feel like I'm "explaining" how B is being included.
Chad Birch
Tom states the rationale for the preference better than I did in my answer. Basically, the preference for the latter pattern emerges in more complex joins, when joining multiple tables, inline views, on multiple predicates. It's habit to follow the same pattern for the simpler cases as well, as in the example in the original question, joining two tables with one predicate.
spencer7593
+2  A: 

It doesn't really matter, both are correct. My preference is for the second.

My preference is based on the idea that table BBB is the table I'm adding into the result set, and the job at hand is tying columns (expressions) from the new table BBB to other columns already in the result set. It may make more sense in a different example:

SELECT ...
  FROM AAA a
  JOIN BBB b ON (b.AAA_ID = a.ID)
  JOIN CC c ON (c.AAA_ID = b.AAA_ID AND UPPER(c.FEE) IN ('FI','FO'))
  JOIN DDD d ON (d.CC_ID = c.ID AND LEFT(d.DAH,2) = c.FEE)

Yes, this is an arbitrarily complex example, but sometimes real code does get this complicated. When referencing multiple predicates in the join condition, I find it helpful when each predicate references first (on the left side) expressions from the table most recently joined.

There are other patterns that help as well. For example, when the primary key of each table is a single column named "ID" and foreign key columns are typically named PARENTTABLE_ID, a construct like a.ID=b.ID reads as a primary key joined to a primary key (a one-to-one relationship, which is not the normative pattern). And when I see b.FOREIGN_ID = c.FOREIGN_ID, what I'm seeing is a foreign key being joined to a foreign key. Again, that is not the usual pattern, and it indicates this may be a many-to-many join, or maybe a shortcut join for performance. The usual pattern I'm looking for in a parent-child join is like child.PARENT_ID = parent.ID.

These patterns aren't right or wrong, just a preference. I find that these patterns don't make code that is right look pretty, but they do make code that is "odd" stand out.

spencer7593
Tom H. did a better job explaining it than I did, and I concur with his answer, for the reasons he gives. If it were just the simple case, it wouldn't matter. The preference is based on following the same pattern that we find helpful with more complicated cases.
spencer7593
A: 

It doesn't matter which order you write the join in as long as the tables you're referencing have already been mentioned in the query.

I personally prefer to list the most recent table second (option 1). This convention helps because I have decided to always use a LEFT OUTER JOIN when needed (rather than a RIGHT OUTER JOIN) and I don't have to think about which table is going to be on the right or the left.

Mike
I'm a bit confused by your answer. The order of the tables in the FROM clause (and the result set you want) determines whether an outer join is specified as LEFT or RIGHT. The order of expressions in the join predicate is independent of the outer join.
spencer7593
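
The point in the comment above can be checked with a small experiment. This is a hedged sketch using Python's built-in sqlite3 module. The single-column tables, the sample IDs, and the `rows` helper are assumptions for illustration, not anything from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: ID 1 exists only in AAA, ID 3 only in BBB
conn.executescript("""
    CREATE TABLE AAA (ID INTEGER PRIMARY KEY);
    CREATE TABLE BBB (ID INTEGER PRIMARY KEY);
    INSERT INTO AAA VALUES (1), (2);
    INSERT INTO BBB VALUES (2), (3);
""")

def rows(sql):
    """Return the result as a set so row order does not affect the comparison."""
    return set(conn.execute(sql).fetchall())

# Same table order, predicate written both ways
a_left_ab = rows("SELECT AAA.ID, BBB.ID FROM AAA LEFT OUTER JOIN BBB ON AAA.ID=BBB.ID")
a_left_ba = rows("SELECT AAA.ID, BBB.ID FROM AAA LEFT OUTER JOIN BBB ON BBB.ID=AAA.ID")

# Same predicate, table order swapped
b_left = rows("SELECT AAA.ID, BBB.ID FROM BBB LEFT OUTER JOIN AAA ON AAA.ID=BBB.ID")

print(a_left_ab == a_left_ba)  # True: predicate order is irrelevant
print(a_left_ab == b_left)     # False: table order decides which rows are preserved
```

Swapping the two sides of the predicate leaves the result alone. Swapping the tables around the LEFT OUTER JOIN changes which side keeps its unmatched rows.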
A: 

Let's remember here that LINQ is not SQL, so why is it even being mentioned here? SQL questions should NOT be answered with LINQ answers. That is just odd and quite irrelevant in my opinion!!

Both examples are acceptable. I prefer the second example, as most SQL developers would. It also depends on the type of SQL JOIN you want to use. The question title is simply for "JOIN" but your example uses an INNER JOIN.

In this instance, for an INNER JOIN, it does not matter. But keep in mind, that both examples will produce a HASH MATCHED JOIN, which is not ideal when dealing with indexes. A Loop Join is much more efficient. Just make sure you consider your indexes as Hash Joins are an indicator of inefficient indexing.

Devtron
A: 

Couldn't it make a performance difference depending on how the two tables are indexed and on what field they are joined?

The query optimizer in some DB engines might be able to do the right thing regardless of what order you specify, but only some time with the query plan and testing would answer for sure.
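
For anyone who wants to do that testing, here is a rough sketch of comparing plans for the two forms. It uses SQLite through Python's built-in sqlite3 module only because it is easy to run. The `val` columns and the `plan_details` helper are made up, and other engines have their own plan commands and may behave differently.

```python
import sqlite3

def plan_details(on_clause):
    """Return the detail strings from SQLite's EXPLAIN QUERY PLAN for the join."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; the question does not show the real tables or indexes
    conn.executescript("""
        CREATE TABLE AAA (ID INTEGER PRIMARY KEY, val TEXT);
        CREATE TABLE BBB (ID INTEGER PRIMARY KEY, val TEXT);
    """)
    sql = f"EXPLAIN QUERY PLAN SELECT * FROM AAA INNER JOIN BBB ON {on_clause}"
    # The human-readable description is the last column of each plan row
    details = [row[-1] for row in conn.execute(sql)]
    conn.close()
    return details

plan_a_first = plan_details("AAA.ID=BBB.ID")  # option 1
plan_b_first = plan_details("BBB.ID=AAA.ID")  # option 2
for label, plan in (("A=B", plan_a_first), ("B=A", plan_b_first)):
    print(label, plan)
```

If the two printed plans match on your engine and your real schema, the order of the predicate is purely a style choice there.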

jmanning2k