ansaurus

Question

Answer 1

+2 A:

The first query is impossible and will never return a result set. It's basically saying "Give me all the records in the table where color_id is 1 AND color_id is 2" which can never happen.

If you want to ask the difference between

SELECT DISTINCT a.id 
  FROM RelationshipTable as a 
  JOIN RelationshipTable as b ON b.id = a.id 
 WHERE a.color_id = 1 
   OR b.color_id = 2;

versus

SELECT DISTINCT color_id 
  FROM RelationshipTable 
 WHERE color_id = 1 
    OR color_id = 2;

In this case the first will always be slower than the second for large tables. The first results in a full table scan for table A while the second one uses the indexes that should be used in the where clause.

methodin 2010-10-20 18:05:35

This is not correct. I neglected to mention that id is NOT a unique key. Therefore I'm going to get all iterations of the combinations of duplicate ids, making a table where color_id's being non-identical possible.

Matt Dunnam 2010-10-20 18:19:41

Answer 2

+1 A:

The two are not the same query and should not be giving the same result set. In the first query you want all the records which meet both conditions, you have a record with a color_id = of 1 and a record with a color_id of 2 for the same ID. In the second query you will get all records that have both color ids and all records that have only one or the other. Of course since you are asking for a differnt field to be returned you might not see this. And the second query is somewhat silly anyway as it can be expressed as:

select 1 as color id 
union all
select 2

And never hit a table at all. That would make it super fast.

HLGEM 2010-10-20 18:40:07

ansaurus

tags:

views:

answers:

Why are self joins faster than or?

related questions