tags:

views:

619

answers:

4

It seems to me that you can do the same thing in a SQL query using either NOT EXISTS, NOT IN, or LEFT JOIN WHERE IS NULL. For example:

SELECT a FROM table1 WHERE a NOT IN (SELECT a FROM table2)

SELECT a FROM table1 WHERE NOT EXISTS (SELECT * FROM table2 WHERE table1.a = table2.a)

SELECT a FROM table1 LEFT JOIN table2 ON table1.a = table2.a WHERE table1.a IS NULL

I'm not sure if I got all the syntax correct, but these are the general techniques I've seen. Why would I choose to use one over the other? Does performance differ...? Which one of these is the fastest / most efficient? (If it depends on implementation, when would I use each one?)

+9  A: 

In a nutshell:

NOT IN is a little bit different: it never matches if there is but a single NULL in the list.

  • In MySQL, NOT EXISTS is a little bit less efficient

  • In SQL Server, LEFT JOIN / IS NULL is less efficient

  • In PostgreSQL, NOT IN is less efficient

  • In Oracle, all three methods are the same.

Quassnoi
Thanks for the links! And thanks for the quick overview... My office is blocking the link for some reason :P but I'll check it out as soon as I get to a regular computer.
froadie
A: 

If the database is good at optimising the query, the two first will be transformed to something close to the third.

For simple situations like the ones in you question, there should be little or no difference, as they all will be executed as joins. In more complex queries, the database might not be able to make a join out of the not in and not exists queryes. In that case the queries will get a lot slower. On the other hand, a join may also perform badly if there is no index that can be used, so just because you use a join doesn't mean that you are safe. You would have to examine the execution plan of the query to tell if there may be any performance problems.

Guffa
A: 

Just tried in my own case on my database in MS Access 2007 (T-SQL)

  • NOT IN - about 10 sec
  • NOT EXISTS - about 1.5 sec
baleks
A: 

When need to insert data in table with multi-field primary key, consider that it will be much faster (i tried in Access but i think in any Database) not to check that "not exists records with 'such' values in table", - rather just insert into table, and excess records (by the key) will not be inserted twice. good luck to all ;)

baleks