I have a query like this:

SELECT `table_1`.* from `table_1`
  INNER JOIN `table_2` [...]
  INNER JOIN `table_3` [...]
WHERE `table_1`.`id` IN(
  SELECT `id` FROM [...]
)
AND [more conditions]

When I use EXPLAIN, there is 'DEPENDENT SUBQUERY' at the end, but I want this subquery to be performed first, before other conditions.

Is it possible?

+1  A: 

You could select the subquery's result into a temporary table first and then join to it in the main query.
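
A minimal sketch of this idea, assuming MySQL. The question elides the real tables, so `id_source` (standing in for the subquery's source), `tmp_ids`, and the INT key type are all hypothetical names and assumptions used only for illustration:

```sql
-- Hypothetical schema: the real tables are elided in the question.
CREATE TABLE table_1   (id INT NOT NULL PRIMARY KEY, name VARCHAR(50));
CREATE TABLE id_source (id INT NOT NULL);

-- 1. Materialize the subquery result once, with a key for the join.
CREATE TEMPORARY TABLE tmp_ids (id INT NOT NULL PRIMARY KEY);
INSERT INTO tmp_ids (id)
SELECT DISTINCT id FROM id_source;

-- 2. Join to the temporary table instead of using IN (subquery).
SELECT table_1.*
FROM   table_1
INNER JOIN tmp_ids ON tmp_ids.id = table_1.id;

-- 3. Clean up (the table would also disappear when the session ends).
DROP TEMPORARY TABLE tmp_ids;
```

The other joins and conditions from the original query would go into step 2 unchanged.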

Andrew Hare
+2  A: 

The best way is to put the WHERE conditions for table_1 into a subquery in the FROM clause. For example:

SELECT `table_1`.* 
FROM (
      SELECT * FROM `table_1` WHERE `table_1`.`id` IN (...)
     ) AS `table_1`
  INNER JOIN `table_2` [...]
  INNER JOIN `table_3` [...]
WHERE [more conditions]
Chris Baxter
+1  A: 

Unfortunately, no, you cannot. MySQL treats this as a DEPENDENT SUBQUERY, which is run once for each row in the outer query.

I'd actually suggest that you convert this to another join, using table_1.id as your key to the other table.

gms8994
+2  A: 
SELECT  `table_1`.*
FROM    `table_1`
INNER JOIN
        `table_2` [...]
INNER JOIN
        `table_3` [...]
WHERE   `table_1`.`id` IN
        (
        SELECT  `id`
        FROM    [...]
        )
        AND [more conditions]

If the inner table is properly indexed, the subquery here is not being "performed" at all in the strict sense of the word.

Since the subquery is a part of an IN expression, the condition is pushed into the subquery and it's transformed into an EXISTS.

In fact, this subquery is evaluated on each step:

EXISTS
(
SELECT  NULL
FROM    [...]
WHERE   id = table_1.id
)

You can actually see it in the detailed description provided by EXPLAIN EXTENDED.

That's why it's called DEPENDENT SUBQUERY: the result of each evaluation depends on the value of table_1.id. The subquery as such is not correlated; it's the optimized version that is correlated.
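
To look at the rewritten form yourself, one option is to run the statement under EXPLAIN EXTENDED and then ask for the warnings. This is a sketch that assumes a MySQL version which still accepts the EXTENDED keyword (newer versions produce the extended output with plain EXPLAIN); `id_source` is a hypothetical stand-in for the elided table:

```sql
-- id_source is a hypothetical stand-in for the elided FROM [...] table.
EXPLAIN EXTENDED
SELECT table_1.*
FROM   table_1
WHERE  table_1.id IN (SELECT id FROM id_source);

-- Shows the statement as the optimizer rewrote it (a Note-level message).
SHOW WARNINGS;
```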

MySQL always evaluates the EXISTS clause after the simpler filters (since they are much easier to evaluate and there is a probability that the subquery won't be evaluated at all).

If you want the subquery to be evaluated all at once, rewrite the query like this:

SELECT  table_1.*
FROM    (
        SELECT  DISTINCT id
        FROM    [...]
        ) q
JOIN    table_1
ON      table_1.id = q.id
JOIN    table_2
ON      [...]
JOIN    table_3
ON      [...]
WHERE   [more conditions]

This forces the subquery to be leading in the join, which is more efficient if the subquery is small compared to table_1, and less efficient if the subquery is large compared to table_1.

If there is an index on [...].id used in the subquery, the subquery will be performed using an index for group-by (EXPLAIN reports this as `Using index for group-by`).

Quassnoi
+1  A: 

This is a known bug in MySQL: http://bugs.mysql.com/bug.php?id=25926

A useful workaround is to push the subquery down one more level by wrapping it in another select, of the form `select * from (subquery) as dt`.
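
A rough sketch of what that wrapping could look like, assuming MySQL; `id_source` is a hypothetical stand-in for the table elided in the question, and `dt` is just an alias:

```sql
-- id_source is a hypothetical stand-in for the elided FROM [...] table.
SELECT table_1.*
FROM   table_1
WHERE  table_1.id IN (
         SELECT id
         FROM (
           SELECT id FROM id_source   -- the original subquery, unchanged
         ) AS dt                      -- extra level: a derived table needs an alias
       );
```

Running EXPLAIN on both forms should show whether the inner select is now reported as DERIVED.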

longneck
The bug you mention affects only the `GROUP BY / DISTINCT` subqueries. An `IN` predicate over a subquery with a `DISTINCT` clause in it is equivalent to a `JOIN`.
Quassnoi
No, the problem also occurs in queries that do not involve `GROUP BY` or `DISTINCT`. Disproving this is simple: push the subquery down another level as I suggest and see if the DEPENDENT SUBQUERY changes to a DERIVED table. (And how do you know this query doesn't have GROUP BY? The original poster did not provide the actual query.)
longneck
That workaround worked for me: it took a query that took 32 minutes down to 0.452 seconds. Thanks!
Deebster
A: 

Hi

I tried the solutions above, but they did not help. I replaced my subquery with a select on a view. That helped, as the optimizer now sees the subquery as a regular join rather than as the subquery it is.

regards

Max