ansaurus

Question

Answer 1

+3 A:

Starting from a certain number of records, the IN predicate over a SELECT becomes faster than that over a list of constants.

See this article in my blog for performance comparison:

Passing parameters in MySQL: IN list vs. temporary table

If the column selected by the subquery in the IN clause is indexed, like this:

SELECT  *
FROM    table1
WHERE   unindexed_column IN
        (
        SELECT  indexed_column
        FROM    table2
        )

then this query is simply optimized into an EXISTS (which uses only one entry for each record from table1).
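
To illustrate the rewrite described above, here is a minimal sketch that runs the IN form and a hand-written correlated EXISTS form against the same data and compares the results. It uses Python's built-in sqlite3 module purely so it can run without a server; that choice, the sample rows and the index name ix_table2_indexed are assumptions made for illustration. It shows only that the two forms return the same rows here, not how MySQL plans or times them.

```python
import sqlite3

# In-memory SQLite database: an assumption for a self-contained demo,
# not a stand-in for MySQL's optimizer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, unindexed_column INTEGER);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, indexed_column INTEGER);
    CREATE INDEX ix_table2_indexed ON table2 (indexed_column);
    INSERT INTO table1 (unindexed_column) VALUES (1), (2), (3), (4), (5);
    INSERT INTO table2 (indexed_column) VALUES (2), (4), (4), (6);
""")

# The IN form from the answer.
in_query = """
    SELECT  id, unindexed_column
    FROM    table1
    WHERE   unindexed_column IN
            (
            SELECT  indexed_column
            FROM    table2
            )
    ORDER BY id
"""

# The correlated EXISTS form: one probe of table2 per row of table1.
exists_query = """
    SELECT  t1.id, t1.unindexed_column
    FROM    table1 t1
    WHERE   EXISTS
            (
            SELECT  1
            FROM    table2 t2
            WHERE   t2.indexed_column = t1.unindexed_column
            )
    ORDER BY t1.id
"""

in_rows = conn.execute(in_query).fetchall()
exists_rows = conn.execute(exists_query).fetchall()

# The duplicate value 4 in table2 does not duplicate rows from table1.
print(in_rows)
print(exists_rows)
```

Note that the duplicate value in table2 does not multiply the output, which is the semi-join behaviour both forms share.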

Unfortunately, MySQL is not capable of doing a HASH SEMI JOIN or a MERGE SEMI JOIN, which are even more efficient (especially if both columns are indexed).

Quassnoi 2009-10-08 13:08:04

That blog article is really useful, thanks.

Roy 2009-10-08 13:26:57

This has really helped me out too. Good article.

Coder 42 2009-11-07 13:13:56

Answer 2

A:

Why do you extract the ids first? You should probably just join the tables. If you use the ids for something else, you can insert them into a temp table first and then use that table for the join.

Eric Hogue 2009-10-08 13:12:35

Yep, you're probably right. I do the extract first because the extract query is really complicated (lots and lots of maths, some subqueries etc) and my tiny brain couldn't work out how to do the join at the same time.. really just wondering if I should be putting that refactor near the top of my to-do list or not!

Roy 2009-10-08 13:21:04

Then you should probably put them in a temp table. That would be simpler than fetching them and building the IN clause. And, as Quassnoi says, it would be faster.
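
A rough sketch of the two approaches, for comparison. It again uses Python's built-in sqlite3 module so that it runs without a server, which is an assumption for illustration only. The table names items and tmp_ids, the sample rows, and the hard-coded wanted_ids list (standing in for the output of the complicated extract query) are all hypothetical. It shows that both approaches select the same rows and makes no claim about speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO items (id, name) VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
""")

# Hypothetical stand-in for the ids produced by the complicated extract query.
wanted_ids = [2, 4]

# Approach 1: build an IN list of constants (as bound parameters).
placeholders = ", ".join("?" for _ in wanted_ids)
in_list_rows = conn.execute(
    f"SELECT id, name FROM items WHERE id IN ({placeholders}) ORDER BY id",
    wanted_ids,
).fetchall()

# Approach 2: store the ids in a temporary table and join against it.
conn.execute("CREATE TEMPORARY TABLE tmp_ids (id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO tmp_ids (id) VALUES (?)",
    [(i,) for i in wanted_ids],
)
join_rows = conn.execute(
    """
    SELECT  i.id, i.name
    FROM    items i
    JOIN    tmp_ids t
    ON      t.id = i.id
    ORDER BY i.id
    """
).fetchall()

print(in_list_rows)
print(join_rows)
```

In a real setup the temporary table would more likely be filled straight from the extract query with an INSERT ... SELECT, so the ids never have to leave the database.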

Eric Hogue 2009-10-08 13:40:16

Performance of MYSQL "IN"
