views:

71

answers:

3
SELECT ... WHERE COL IN(A,B) 

or 

SELECT ... WHERE (COL = A or COL = B)

I'm trying to find out what are the differences between the two constructs?
Would there be significant performance gains either way if utilized on resultsets that are nearing the 1 million mark?

+7  A: 
mysql> describe select * from users where id = 1 or id = 2;
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | users | range | PRIMARY       | PRIMARY | 4       | NULL |    2 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.01 sec)

mysql> describe select * from users where id in (1,2);
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | users | range | PRIMARY       | PRIMARY | 4       | NULL |    2 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)

Judging from DESCRIBE statement's output - they are identical.

Eimantas
To compound on this, the query optimizer should reduce these two simple examples to exactly the same query.If your example is more complex, you'll want to use DESCRIBE/EXPLAIN on the query to get info about how things are being executed.
AvatarKava
Do you **know** that the same query plan is chosen for a query on the user's table, as will be chosen on a table containing millions of records? I'm not sure your exmple convinces me.
lexu
I hope (but without much hope) that the DESCRIBE wasn't done on an empty table, signifying nothing.
le dorfier
These queries were run on a table with 340k records.
Eimantas
@Eimantas: thanks for the clarification!
lexu
+1  A: 

There is no performance difference between using

col = A or col = B

and

col IN (8,7,5,2,....)

The mysql range optimizer optimizes the IN query by computing a sorted list of (8,7,5,2,....) which is then used to construct the corresponding SEL_TREE.

So there is no performance difference as both would be doing a range scan on the index.

ovais.tariq
A: 

Just as a small example:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.1.41-community-log MySQL Community Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SET @qwe=5;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT BENCHMARK(100000000, @qwe IN (10,5));
+--------------------------------------+
| BENCHMARK(100000000, @qwe IN (10,5)) |
+--------------------------------------+
|                                    0 |
+--------------------------------------+
1 row in set (3.52 sec)

mysql> SELECT BENCHMARK(100000000, @qwe = 10 OR @qwe = 5);
+---------------------------------------------+
| BENCHMARK(100000000, @qwe = 10 OR @qwe = 5) |
+---------------------------------------------+
|                                           0 |
+---------------------------------------------+
1 row in set (5.91 sec)

And:

mysql> SELECT BENCHMARK(100000000, @qwe IN (10,1,9,2,8,3,7,4,6,5));
+------------------------------------------------------+
| BENCHMARK(100000000, @qwe IN (10,1,9,2,8,3,7,4,6,5)) |
+------------------------------------------------------+
|                                                    0 |
+------------------------------------------------------+
1 row in set (6.02 sec)

mysql> SELECT BENCHMARK(100000000, @qwe = 10 OR @qwe = 1 OR @qwe = 9
    ->   OR @qwe = 2 OR @qwe = 8 OR @qwe = 3 OR @qwe = 7 OR @qwe = 4
    ->   OR @qwe = 6 OR @qwe = 5) as result;
+--------+
| result |
+--------+
|      0 |
+--------+
1 row in set (20.20 sec)

Mind the duration in parentheses.

newtover