views:

51

answers:

3

I have heard joins should be preferred over nested queries. Is it true in general? Or there might be scenarios where one would be faster than other:

for e.g. which is more efficient way to write a query?:

Select emp.salary 
from employee emp    
where emp.id = (select s.id from sap s where s.id = 111)

OR

Select emp.salary     
from employee emp   
INNER JOIN sap s ON emp.id = s.id   
WHERE s.id = 111
A: 

It is much faster (and easier to write) to join two tables on an index than to run two separate queries (even a subquery).

tandu
But why? What's the difference in execution plan?
Kirk Woll
No difference because MySQL will optimize it to probably use the JOIN and do the same thing, but you will see it running a PRIMARY query and a SUBQUERY. Not sure whether this impacts on a speed. Another thing is you are semantically joining the two tables, so why not write this?
tandu
@tandu, I think for a great many queries, subqueries offer more clarity. If you only want one piece of data from the other table, it is *far* more clear for it to be all in one place (with a subquery) than to split it up into the select, from/join, and potentially the where clause of the primary query.
Kirk Woll
Totally disagree. When you JOIN you are asking for the relational product of the two tables which is the rows upon which their index (indeces?) match. With a subquery you are saying, eh, give me all of the rows in table 1 that match this piece of data (an ID? probably one). Now, compare them to all of the values on this other (primary) table where the keys match.
tandu
Wow. You are a rare bird if you think that subqueries are unilaterally less readable than a join.
Kirk Woll
You think so, but I have never met anyone who thinks like you before today..
tandu
A: 

If the queries are logically equivalent, then the query optimizer should be able to make the same (best) execution plan from each one. In that case, query style should support what can be understood the best (that's subqueries for me).

David B
+2  A: 

I have heard joins should be preferred over nested queries. Is it true in general?

It depends on the requirements, and the data.

Using a JOIN risks duplicating the information in the resultset for the parent table if there are more than one child records related to it, because a JOIN returns the rows that match. Which means if you want unique values from the parent table while using JOINs, you need to look at using either DISTINCT or a GROUP BY clause. But none of this is a concern if a subquery is used.

Also, subqueries are not all the same. There's the straight evaluation, like your example:

where emp.id = (select s.id from sap s where s.id = 111)

...and the IN clause:

where emp.id IN (select s.id from sap s where s.id = 111)

...which will match any of the value(s) returned by the subquery when the straight evaluation will throw an error if s.id returns more than one value. But there's also the EXISTS clause...

WHERE EXISTS(SELECT NULL 
               FROM SAP s
              WHERE emp.id = s.id
                AND s.id = 111)

The EXISTS is different in that:

  • the SELECT clause doesn't get evaluated - you can change it to SELECT 1/0, which should trigger a divide-by-zero error but won't
  • it returns true/false; true based on the first instance the criteria is satisfied so it's faster when dealing with duplicates.
  • unlike the IN clause, EXISTS supports comparing two or more column comparisons at the same time, but some databases do support tuple comparison with the IN.
  • it's more readable
OMG Ponies
+1, thanks for thorough answer.
Kirk Woll
@OMG: Do you think there are situations where the `IN` clause is better than the `EXISTS`?
Daniel Vassallo
@Daniel Vassallo: Not at this moment. I wonder the EXISTS approach also circumvents the issue with a high number of values to compare against that IN has on some databases (in the thousands).
OMG Ponies