ansaurus

Question

Which will be faster out of these two queries?

Answer 1

+5 A:

They are the same.

Edit: They are the same in MySQL.

using an index:

mysql> explain select * from trees where id = 1 or id = 2;
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | trees | range | PRIMARY       | PRIMARY | 8       | NULL |    2 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+

mysql> explain select * from trees where id in (1,2);
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | trees | range | PRIMARY       | PRIMARY | 8       | NULL |    2 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+

using a full table scan:

mysql> explain select * from trees where version = 1 or version = 2;
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | trees | ALL  | NULL          | NULL | NULL    | NULL |    7 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+

mysql> explain select * from trees where version in (1,2);
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | trees | ALL  | NULL          | NULL | NULL    | NULL |    7 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+

cherouvim 2009-12-10 21:01:10

That seems unlikely.

Robert Harvey 2009-12-10 21:01:52

I thought `IN (whatever)` just translated out to multiple `OR` tests, so it seems like they would be the same.

Sarah Vessels 2009-12-10 21:03:02

Very unlikely, indeed.

Romain 2009-12-10 21:03:06

It doesn't seem unlikely when it comes to execution plan. It depends on how DBMS decides to optimize the query.

Cătălin Pitiș 2009-12-10 21:03:07

E.g. sql server, the in clause does get to translated a series of ORs

AdaTheDev 2009-12-10 21:06:42

Saying that it seems unlikely isn't a very helpful comment. +1 for post as these do come out the same in MySQL which we believe the question was directed towards.

Seth 2009-12-10 21:27:06

So wouldn't the IN statement be a fraction slower than the OR's since the db engine would need to make the translation?

Adam 2009-12-10 22:13:22

@Adam: that is a micro(nano, pico)fraction of processing time, and the engine will anyway cache the query after the first execution. It shouldn't really make any difference.

cherouvim 2009-12-11 06:57:02

@Adam: Technically the optimiser doesn't translate 'one into the other'. Both queries are translated into a the same 'internal plan'.

Craig Young 2009-12-11 07:58:18

Answer 2

+2 A:

It depends on the interpretation of the database vendor and how they represent the where statements.

monksy 2009-12-10 21:02:46

Answer 3

+2 A:

In Microsoft SQL Server, I believe they are interpreted in exactly the same way.

Paul 2009-12-10 21:04:02

Answer 4

+2 A:

likely or not, but true... Most databases convert the former syntax to the latter. Why wouldn't it? They are logically and functiuonally identical.

See here

Charles Bretana 2009-12-10 21:04:10

Because there could be faster ways of doing this depending on how data is internally represented and how the engine is designed. The difference would, still, likely be minimalistic.

Romain 2009-12-10 21:06:26

No matter how data is represented internally, these two queries are semantically, logically, and functionally identical. No matter how many ways the processor might be able to do them, it can do either of them in any of the many ways it could do the other one.

Charles Bretana 2009-12-10 21:14:55

To make an absolutely identical analogy, it would be as if the language allowed you to either say "Where ColumnName = 'bob'" or "Where ColumnName equals 'bob'" -------- or, and this IS the case in many databases, "Where ColumnName != 'Bob'" Or "Where ColumnName <> 'bob'"

Charles Bretana 2009-12-10 21:22:18

Answer 5

+3 A:

With most RDBMSs the queries are processed through an optimizer before execution. And since your predicates are logically equivalent i would expect them to execute at the same speed because they will be optimized to execute in the same manner.

But it is db dependent. You didn't specify which engine you're using.

Paul Sasik 2009-12-10 21:05:11

Answer 6

+2 A:

Run EXPLAIN SELECT ... on both.

el.pescado 2009-12-10 21:07:48

+1. Find out for sure if they are different or the same.

David Oneill 2009-12-10 21:09:10

Answer 7

A:

In this case the queries are equivalents or the first one will be a little faster in a test. Certain optimizations are applicable to comparisons that use the IN operator from a subquery which is not the case here.

Elzo Valugi 2009-12-10 21:09:55

Answer 8

+1 A:

The correct answer is: depends on the engine. Use EXPLAIN to check what's going on.

Some databases will translate IN form to a series of ORed clauses. Others will not.

Postgres for example gives a lower (!) cost when using the IN form.

I'd personally go for a first form as (1) some engines may optimize it, (2) it's more intuitive, (3) is shorter.

Somewhat offtopic: please don't be tempted to overuse IN for joining the tables though. I've seen many times constructs like this:

SELECT * FROM employee WHERE boss_id IN (SELECT id FROM boss WHERE some_condition..)

I guess the reason for it is that somebody goes with the outer select, then finds out the condition in the referenced table by issuing the inner select, then just does copy & paste. Of course such construct is worse than the proper JOIN.

Wojciech Kaczmarek 2009-12-10 21:37:22

And why could an optimizer not flatten that join?

tster 2009-12-10 21:53:19

I don't know why, ask Oracle.. ;)But seriously, I've seen many explain plans of sub(sub)queries and some were optimized, but most not. I sometimes use this style for throwaway queries, still I could say horror stories how such code made its way to production and ruined some of our admin man-days. But that kind of stories rather belongs to our sister serverfault..Just quite a bad habit to use such queries when you don't have to, don't you agree?

Wojciech Kaczmarek 2009-12-11 05:16:32

ansaurus

tags:

views:

answers:

Which will be faster out of these two queries?

related questions