views:

143

answers:

6

Today I got into a debate with my project manager about Cartesian products. He says a 'natural join' is somehow much better than using 'select from where' because the later cause the db engine to internally perform a Cartesian product but the former uses another approach that prevents this. As far as I know, the natural join syntax is not any different in anyway than 'select from where' in terms of performance or meaning, I mean you can use either based on your taste.

SELECT * FROM table1,table2 WHERE table1.id=table2.id
SELECT * FROM table1 NATURAL JOIN table2

please elaborate about the first query causing a Cartesian product but the second one being somehow more smart

+1  A: 

http://stackoverflow.com/questions/121631/inner-join-vs-where

Alexander
So basically, it depends on how *smart* the query optimizer is. Probably smart enough.
MvanGeest
When I was first learning SQL, I wasn't using joins. But then, after realizing the elasticity of the query when using joins (left, right, inner, outer), I just started writing everything with joins. It's simply faster to edit from one type to another.
Alexander
A: 

Performance-wise, there is no difference. Its been discussed over and over and over again. Googling for "join syntax oracle vs where" results in several good articles, including the one on this site referenced by Alexander.

However, be careful using a NATURAL JOIN. It will pick up on common columns like createdate or createuser or such that you normally really don't care about joining on and may cause problems. I highly recommended against NATURAL JOIN in production...just use INNER JOIN and specify the columns.

Even Tom agrees.

rfusca
+1  A: 

First thing to point out is that database optimizers interpret syntax in their own way. Obviously each product varies but I would be frankly astonished if any DBMS penalised what is the commonest mechanism for joining tables.

With regards to terminology, it is a cross join which generates a cartesian product. That is different from an inner join, and would generate a different result set.

Finally, natural joins are horrible, literally bugs waiting to happen. They should be avoided by all right-thinking people.

APC
+6  A: 

The correct way should be explicit with filters and joins separated

SELECT * FROM table1 JOIN table2 ON table1.id = table2.id

NATURAL JOINS may be easy and "clean" but more likely to be utterly unpredictable...

gbn
+1: The more obvious things are, the easier they are to fix or improve.
OMG Ponies
+1 Mercifully, a good number of DB vendors do not support Natural Joins. IMO, explicit joins are easier for the reader as they are not left with the mystery of determining the joining columns.
Thomas
In particular, say you have a second column like "is_active" that is common to both tables. That would be included in a natural join, and you'd likely get fewer rows than you were expecting.
Justin K
+2  A: 

It depends.

A natural join links all columns in two tables with the same name. If the only two columns in tables 1 and 2 with the same name are ID, then the two queries should be evaluated identically by the optimiser; on the other hand, if there are more than two columns with the same name (or none at all) in the two tables, a completely different query is being performed.

In any case, a cartesian product will almost invariably (I'm tempted to say always) perform worse than any other type of join, as it joins every record of one table with every record of the other table.

How good is your manager at distinguishing his gluteus maximus from the upper end of his ulna?

Mark Bannister
The first query is not really cartesian: just a JOIN in the WHERE clause. I agree with your sentiments about anatomical knowledge though
gbn
Perhaps I should have said explicitly that a cartesian join will never occur for the first query, and only for the second if there are **no** matching columns in the two tables.
Mark Bannister
+1  A: 

I would not use either syntax. Your query indicates an inner join, I would use the explicit syntax for that. You should not be using implied joins ever, they are subject to misinterpretation (was that an accidental cross join or did you mean to do that?) and accidental cross joins. Would you use C# code that was replaced 18 years agao with a better syntax (well actually C# didn't exist 18 years ago, but I think you understand what I'm saying)? Then why are you using outdated SQL code?

Not only is the implied join a problem for maintenance but it can be a big problem if you try to use the implied join syntax for outer joins as that does not work correctly in some databases and is also deprecated in at least one database, SQL Server, I know. And if you have the need for a filter on the table in the left join, you can't do that with the implied syntax at all becasue it will convert it to an innner join.

Yes your code works but it is a poor technique and you should get used to using the inner join explicitly, so that you are making your intent clear to furture maintainers and so you don't create accidental problems as you write more complex queries. If using the explicit syntax is not second nature for you, you will really struggle when you need to use if for something more complicated.

I have never in 30 years of querying databases seen a need to write a natural join and had to look up what one was, so using that is not more clear than the implied join.

HLGEM