tags:

views:

10419

answers:

10

Duplicate of:

What’s the difference between just using multiple froms and joins?

Most SQL dialects accept both the following queries:

SELECT a.foo, b.foo
FROM a, b
WHERE a.x = b.x

SELECT a.foo, b.foo
FROM a
LEFT JOIN b ON a.x = b.x

Now obviously when you need an outer join, the second syntax is required. But when doing an inner join why should I prefer the second syntax to the first (or vice versa)?

A: 

I hear a lot of people complain the first one is too difficult to understand and that it is unclear. I don't see a problem with it, but after having that discussion, I use the second one even on INNER JOINS for clarity.

Kevin
I was brought up with the habit of not using the JOIN syntax and doing it the first way. I must admit I am still stuck in the habit often times just because I think my brain has been conditioned to follow that logic, wheras the join syntax at times to me seems hard to think in.
TheTXI
I was taught that way too. I changed my coding style, because people would look at it and not easily recognize what was going on. Since there is no logical difference and I can find no reason for choosing the former over the latter, I felt that I should adapt to making the code clearer to help others understand what I write.
Kevin
A: 

To the database, they end up being the same. For you, though, you'll have to use that second syntax in some situations. For the sake of editing queries that end up having to use it (finding out you needed a left join where you had a straight join), and for consistency, I'd pattern only on the 2nd method. It'll make reading queries easier.

Autocracy
A: 

The JOIN syntax keeps conditions near the table they apply to. This is especially useful when you join a large amount of tables.

By the way, you can do an outer join with the first syntax too:

WHERE a.x = b.x(+)

Or

WHERE a.x *= b.x

Or

WHERE a.x = b.x or a.x not in (select x from b)
Andomar
+1 from me for the additional syntax! I never knew that you could do that...
JFV
The *= syntax is deprecated in MS SQLServer and for good reason: Not only does it make it harder to read, but it doesn't do what people think it does and it is NOT the same as a similarly looking LEFT JOIN. The (+) syntax is unfamiliar to me; what SQL implementation does that?
Euro Micelli
The other syntax is used by Oracle, at least.
Lasse V. Karlsen
Never use the SQL Server syntax *=, it will NOT give consistent results as it will sometimes interpret as a cross join not a left join. This is true even as far back as SQL Server 2000. If you have any code using this, you need to fix.
HLGEM
A: 

Well the first and second queries may yield different results because a LEFT JOIN includes all records from the first table, even if there are no corresponding records in the right table.

ghills
A: 

When you need an outer join the second syntax is not always required:

Oracle:

SELECT a.foo, b.foo
  FROM a, b
 WHERE a.x = b.x(+)

MSSQLServer (although it's been deprecated in 2000 version)/Sybase:

SELECT a.foo, b.foo
  FROM a, b
 WHERE a.x *= b.x

But returning to your question. I don't know the answer, but it is probably related to the fact that a join is more natural (syntactically, at least) than adding an expression to a where clause when you are doing exactly that: joining.

Pablo Santa Cruz
SQL server has deprecated that left join syntax and even in SQL Server 2000 it will not consistently give correct results (sometimes it does a cross join instead of a left join) and should never be used in SQL Server.
HLGEM
@HLGEM: Thanks for the info. I am going to UPDATE my post to reflect what you are saying.
Pablo Santa Cruz
+1  A: 

The second is preferred because it is far less likely to result in an accidental cross join by forgetting to put inthe where clause. A join with no on clause will fail the syntax check, an old style join with no where clause will not fail, it will do a cross join.

Additionally when you later have to a left join, it is helpful for maintenance that they all be in the same structure. And the old syntax has been out of date since 1992, it is well past time to stop using it.

Plus I have found that many people who exclusively use the first syntax don't really understand joins and understanding joins is critical to getting correct results when querying.

HLGEM
A: 

Basically, when your FROM clause lists tables like so:

SELECT * FROM
  tableA, tableB, tableC

the result is a cross product of all the rows in tables A, B, C. Then you apply the restriction WHERE tableA.id = tableB.a_id which will throw away a huge number of rows, then further ... AND tableB.id = tableC.b_id and you should then get only those rows you are really interested in.

DBMSs know how to optimise this SQL so that the performance difference to writing this using JOINs is negligible (if any). Using the JOIN notation makes the SQL statement more readable (IMHO, not using joins turns the statement into a mess). Using the cross product, you need to provide join criteria in the WHERE clause, and that's the problem with the notation. You are crowding your WHERE clause, which should only contain RESTRICTIONS to the resultset, with stuff like

    tableA.id = tableB.a_id 
AND tableB.id = tableC.b_id

which only goes to restrict the senseless cross product. Your real restrictions on resultset are then harder to discern. You should definitely use JOINs and keep the FROM clause a FROM clause, and the WHERE clause a WHERE clause.

Peter Perháč
+1  A: 

The SELECT * FROM table1, table2, ... syntax is ok for a couple of tables, but it becomes exponentially (not necessarily a mathematically accurate statement) harder and harder to read as the number of tables increases.

The JOIN syntax is harder to write (at the beginning), but it makes it explicit what criteria affects which tables. This makes it much harder to make a mistake.

Also, if all the joins are INNER, then both versions are equivalent. However, the moment you have an OUTER join anywhere in the statement, things get much more complicated and it's virtually guarantee that what you write won't be querying what you think you wrote.

Euro Micelli
+7  A: 

The old syntax, with just listing the tables, and using the WHERE clause to specify the join criteria, is being deprecated in most modern databases.

It's not just for show, the old syntax has the possibility of being ambiguous when you use both INNER and OUTER joins in the same query.

Let me give you an example.

Let's suppose you have 3 tables in your system:

Company
Department
Employee

Each table contain numerous rows, linked together. You got multiple companies, and each company can have multiple departments, and each department can have multiple employees.

Ok, so now you want to do the follow:

List all the companies, and include all their departments, and all their employees. Note that some companies doesn't have any departments yet, but make sure you include them as well. Make sure you only retrieve departments that have employees, but always list all companies.

So you do this:

SELECT * -- for simplicity
FROM Company, Department, Employee
WHERE Company.ID *= Department.CompanyID
  AND Department.ID = Employee.DepartmentID

Note that the last one there is an inner join, in order to fulfill the criteria that you only want departments with people.

Ok, so what happens now. Well, the problem is, it depends on the database engine, the query optimizer, indexes, and table statistics. Let me explain.

If the query optimizer determines that the way to do this is to first take a company, then find the departments, and then do an inner join with employees, you're not going to get any companies that doesn't have departments.

The reason for this is that the WHERE clause determines which rows ends up in the final result, not individual parts of the rows.

And in this case, due to the left join, the Department.ID column will be NULL, and thus when it comes to the INNER JOIN to Employee, there's no way to fulfill that for this row, and so it won't appear.

On the other hand, if the query optimizer decides to tackle the department-employee join first, and then do a left join with the companies, you will see them.

So the old syntax is ambiguous. There's no way to specify what you want, without dealing with query hints, and some databases has no way at all.

Enter the new syntax, with this you can choose.

For instance, if you want all companies, as the problem description stated, this is what you would write:

SELECT *
FROM Company
     LEFT JOIN (
         Department INNER JOIN Employee ON Department.ID = Employee.DepartmentID
     ) ON Company.ID = Department.CompanyID

Here you specify that you want the department-employee join to be done as one join, and then left join the results of that with the companies.

Additionally, let's say you only want departments that contains the letter X in their name. Again, with old style joins, you risk loosing the company as well, if it doesn't have any departments with an X in its name, but with the new syntax, you can do this:

SELECT *
FROM Company
     LEFT JOIN (
         Department INNER JOIN Employee ON Department.ID = Employee.DepartmentID
     ) ON Company.ID = Department.CompanyID AND Department.Name LIKE '%X%'

This extra clause is used for the joining, but is not a filter for the entire row. So the row might appear with company information, but might have NULL's in all the department and employee columns for that row, because there is no department with an X in its name for that company. This is hard with the old syntax.

This is why, amongst other vendors, Microsoft has deprecated the old outer join syntax, but not the old inner join syntax, since SQL Server 2005 and upwards. The only way to talk to a database running on Microsoft SQL Server 2005 or 2008, using the old style outer join syntax, is to set that database in 8.0 compatibility mode (aka SQL Server 2000).

Additionally, the old way, by throwing a bunch of tables at the query optimizer, with a bunch of WHERE clauses, was akin to saying "here you are, do the best you can". With the new syntax, the query optimizer has less work to do in order to figure out what parts goes together.

So there you have it.

LEFT and INNER JOIN is the wave of the future.

Lasse V. Karlsen
+1  A: 

The first way is the older standard. The second method was introduced in SQL-92, http://en.wikipedia.org/wiki/SQL. The complete standard can be viewed at http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt .

It took many years before database companies adopted the SQL-92 standard.

So the reason why the second method is preferred, it is the SQL standard according the ANSI and ISO standards committee.

Dwight T