ansaurus

Question

Answer 1

+1 A:

That's rather hard to say - in order to really find out which one works better, you'd need to actually profile the execution times.

As a general rule of thumb, I think if you have indices on your foreign key columns, and if you're using only (or mostly) INNER JOIN conditions, then the JOIN will be slightly faster.

But as soon as you start using OUTER JOIN, or if you're lacking foreign key indexes, the IN might be quicker.
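For what it's worth, here is a rough sketch of how such a timing comparison could look. It uses Python's built-in sqlite3 module purely as a stand-in; the `parent` and `child` tables, the index name, and the row counts are all made up for illustration, and the numbers you get on SQL Server or any other engine may look quite different.

```python
import sqlite3
import timeit

# In-memory SQLite database with hypothetical tables (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES parent (id)
    );
    -- index on the foreign key column, as the rule of thumb assumes
    CREATE INDEX idx_child_parent ON child (parent_id);
""")
conn.executemany("INSERT INTO parent (id) VALUES (?)",
                 [(i,) for i in range(1000)])
conn.executemany("INSERT INTO child (parent_id) VALUES (?)",
                 [(i % 1000,) for i in range(10000)])

join_sql = ("SELECT c.id FROM child c "
            "JOIN parent p ON p.id = c.parent_id WHERE p.id < 100")
in_sql = ("SELECT c.id FROM child c "
          "WHERE c.parent_id IN (SELECT id FROM parent WHERE id < 100)")

def run(sql):
    """Execute the query and fetch every row so the full cost is measured."""
    return conn.execute(sql).fetchall()

# parent.id is unique, so both forms should return the same rows.
same_rows = sorted(run(join_sql)) == sorted(run(in_sql))

# Time 50 executions of each form.
timings = {}
for label, sql in (("JOIN", join_sql), ("IN", in_sql)):
    timings[label] = timeit.timeit(lambda: run(sql), number=50)
    print(f"{label}: {timings[label]:.4f} s for 50 runs")
```

Treat the output only as a measurement of this toy setup; the point is the method (same data, same result set, repeated runs), not the figures.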

Marc

marc_s 2009-07-29 13:35:02

I was thinking this too... because it seems JOIN is a more common case and would more likely be optimized

Polaris878 2009-07-29 13:49:43

Answer 2

+4 A:

Funny you mention that, I did a blog post on this very subject.

See Oracle vs MySQL vs SQL Server: Aggregation vs Joins

Short answer: you have to test it and individual databases vary a lot.

cletus 2009-07-29 13:35:45

Oooh, another nice plug :-)

paxdiablo 2009-07-29 13:43:37

I'm not above self-promotion. :)

cletus 2009-07-29 13:53:46

@cletus: I'm really tempted to register in-vs-join-vs-exists dot com, gather all plugs and start collecting money :)

Quassnoi 2009-07-29 14:05:57

So toss me an upvote and I'll sign up. Promise. ;-)

cletus 2009-07-29 14:10:33

@cletus: your second upvote was mine :)

Quassnoi 2009-07-29 14:13:49

Answer 3

A:

The optimizer should be smart enough to give you the same result either way for normal queries. Check the execution plan and they should give you the same thing. If they don't, I would normally consider the JOIN to be faster. All systems are different, though, so you should profile the code on your system to be sure.

Joel Coehoorn 2009-07-29 13:36:08

Should do? Maybe. Does it? No. See my post.

cletus 2009-07-29 13:36:45

Answer 4

+2 A:

Each database's implementation differs, but you can probably guess that they all solve common problems in more or less the same way. If you are using MSSQL, have a look at the execution plan that is generated. You can do this by turning on the profiler and execution plans. This will give you a text version when you run the command.

I am not sure what version of MSSQL you are using, but you can get a graphical one in SQL Server 2000 in the Query Analyzer. I am sure that this functionality is lurking somewhere in SQL Server Management Studio in later versions.

Have a look at the execution plan. As far as possible, avoid table scans, unless of course your table is small, in which case a table scan is faster than using an index. Read up on the different join operations that each scenario produces.
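As a small illustration of inspecting a plan, the sketch below uses SQLite's `EXPLAIN QUERY PLAN` through Python's sqlite3 module. This is only a stand-in for the SQL Server tooling mentioned above; the tables `a` and `b` and the index name `idx_a_col` are hypothetical, and the exact wording of the plan text depends on the SQLite version.

```python
import sqlite3

# In-memory SQLite database with two hypothetical single-column tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (col INTEGER);
    CREATE TABLE b (col INTEGER);
""")

query = "SELECT a.* FROM a WHERE col IN (SELECT col FROM b)"

def plan(sql):
    """Return the human-readable detail column of each plan step."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Plan without any index on a.col.
before = plan(query)

# Add an index and ask for the plan again.
conn.execute("CREATE INDEX idx_a_col ON a (col)")
after = plan(query)

print("without index:")
for step in before:
    print("  ", step)
print("with index:")
for step in after:
    print("  ", step)
```

Comparing the two listings shows whether the engine still scans the table or switches to an index lookup once the index exists.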

uriDium 2009-07-29 13:36:24

Answer 5

+4 A:

Please see here: http://stackoverflow.com/questions/1001543/in-vs-join-with-large-rowsets

Good reference on this.

AdaTheDev 2009-07-29 13:36:34

I searched but I didn't find that question... hmmm thanks

Polaris878 2009-07-29 13:45:28

:) I was actually looking for a different article I used when I researched into something similar a while ago, and stumbled across that one by mistake

AdaTheDev 2009-07-29 13:48:20

@AdaTheDev: thanks for the mistake, got a free +1 on it :)

Quassnoi 2009-07-29 13:50:46

Answer 6

+11 A:

Generally speaking, IN and JOIN are different queries that can yield different results.

SELECT  a.*
FROM    a
JOIN    b
ON      a.col = b.col

is not the same as

SELECT  a.*
FROM    a
WHERE   col IN
        (
        SELECT  col
        FROM    b
        )

, unless b.col is unique.
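A minimal sketch of that difference, run against an in-memory SQLite database from Python (SQLite is only a convenient stand-in here, and the sample values are made up):

```python
import sqlite3

# In-memory SQLite database; a and b mirror the table names used above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (col INTEGER);
    CREATE TABLE b (col INTEGER);
    INSERT INTO a (col) VALUES (1), (2), (3);
    -- b.col is NOT unique: the value 1 appears twice
    INSERT INTO b (col) VALUES (1), (1), (2);
""")

join_rows = conn.execute(
    "SELECT a.* FROM a JOIN b ON a.col = b.col ORDER BY a.col"
).fetchall()

in_rows = conn.execute(
    "SELECT a.* FROM a WHERE col IN (SELECT col FROM b) ORDER BY a.col"
).fetchall()

# JOIN repeats a row of a once per matching row in b.
print("JOIN:", join_rows)  # [(1,), (1,), (2,)]
# IN returns each row of a at most once.
print("IN:  ", in_rows)    # [(1,), (2,)]
```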

However, this is a synonym for the second (IN) query:

SELECT  a.*
FROM    a
JOIN    (
        SELECT  DISTINCT col
        FROM    b
        ) b
ON      b.col = a.col

If the joining column is UNIQUE and marked as such, both these queries yield the same plan in SQL Server.

If it's not, then IN is faster than JOIN on DISTINCT.

See this article in my blog for performance details:

IN vs. JOIN vs. EXISTS

Quassnoi 2009-07-29 13:36:53

Oooh, nice plug :-)

paxdiablo 2009-07-29 13:43:06

Yeah it makes sense that they would execute the same if the joining column is unique (which it is in my case)

Polaris878 2009-07-29 13:48:29

On a similar note, should I use IN(SELECT DISTINCT ...) or simply IN(SELECT ...)?

orlandu63 2009-07-29 13:53:13

@orlandu63: `IN` implies `DISTINCT`. `SQL Server` is smart enough to notice it, and will generate same plans for both queries. Not sure, though, how other `RDBMS`'s will behave.
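A quick way to convince yourself that the two forms return the same rows, sketched with Python's sqlite3 module (SQLite as a stand-in, made-up sample data). This only checks the results, not whether a given engine produces the same plan:

```python
import sqlite3

# In-memory SQLite database; b.col contains repeated values on purpose.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (col INTEGER);
    CREATE TABLE b (col INTEGER);
    INSERT INTO a (col) VALUES (1), (2), (3);
    INSERT INTO b (col) VALUES (1), (1), (2), (2), (2);
""")

plain = conn.execute(
    "SELECT a.col FROM a WHERE col IN (SELECT col FROM b) ORDER BY a.col"
).fetchall()

distinct = conn.execute(
    "SELECT a.col FROM a WHERE col IN (SELECT DISTINCT col FROM b) "
    "ORDER BY a.col"
).fetchall()

# Duplicates in the subquery do not multiply rows of a.
print(plain == distinct)  # True
```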

Quassnoi 2009-07-29 14:05:00

Answer 7

A:

An interesting writeup on the logical differences: SQL Server: JOIN vs IN vs EXISTS - the logical difference

I am pretty sure that, assuming the relations and indexes are maintained, a JOIN will perform better overall (more effort goes into working with that operation than others). If you think about it conceptually, it's the difference between two queries and one query.

You need to hook it up to the Query Analyzer and try it and see the difference. Also look at the Query Execution Plan and try to minimize steps.

AdamSane 2009-07-29 13:38:41


SQL JOIN vs IN performance?
