ansaurus

Question

MySQL "in" statement

Answer 1

+2 A:

Becuase the subquery is being performed once for every row in the enclosing query, and the join is only performed once.

Databases are optimised for set based queries, so most of the time joins will be faster than subqueries.

You are certainly not the only one to experience this (here is one example). Looks like the query optimizer for MySql doesn't optimise subqueries of this type.

Oded 2010-01-14 10:21:03

When joining tables does it not do the same thing?

Tim 2010-01-14 10:22:45

When doing a join you get the corresponding rows from both tables. When using a subquery, the second table can be fully re-queried for each row.

Oded 2010-01-14 10:25:04

Tim, you're right that they're equivalent. And some other database engines can perform subquery-to-join transformations on queries when the query is compiled/prepared. That would change your 2nd query into the 1st. I know DB2 does this, so probably Oracle, MS SQL Server, etc. PostgreSQL might too.

Harold L 2010-01-14 10:31:21

I can confirm that SQL Server does this.

Quick Joe Smith 2010-01-14 10:37:10

It is a classic; see MySQL Bug #9090: http://bugs.mysql.com/bug.php?id=9090. Problem is that some light bulb has decided that it is the correct behaviour.

Jacco 2010-01-14 10:39:47

Answer 2

+1 A:

In response to Tim's comment, no joins are very different.

Let us say that table a has 10,000 rows, and table b has 1,000 rows. The subquery approach runs the inner query for every row of the outer query. That results in 10,000 x 1,000 = 10,000,000 rows to process.

Depending on the fields involved in the join (and whether they are indexed), the join may only have to pass over each table once, resulting in ~11,000 rows processed. It may even only have to read each row of the smaller table and discard the remainder of the larger table resulting in even fewer reads.

It all depends on the join algorithm and the database engine.

Quick Joe Smith 2010-01-14 10:36:04

So if I want to delete records from a table based on query in another table, is there a faster way than doing "delete from table1 where x in (select y from table2)"

Tim 2010-01-14 10:38:41

I can't say for sure in MySQL. The only alternative I can think of is (for complex conditions) to insert the delete candidates into a temporary table and do a "delete from table1 where x in (select x from tmptbl)".

Quick Joe Smith 2010-01-14 10:56:09

@Tim: Yes mysql has MULTI DELETE support with basic join capabilities, ex: delete f.* from foo f, bar b where f.id=b.id;

ggiroux 2010-01-14 17:36:22

ansaurus

tags:

views:

answers:

MySQL "in" statement

related questions