tags:

views:

17907

answers:

9

I would like to know the difference between the EXIST and IN clause in SQL?

When should we use EXIST and when IN should be used?

Thank you.

+2  A: 

I'm assuming you know what they do, and thus are used differently, so I'm going to understand your question as: When would it be a good idea to rewrite the SQL to use IN instead of EXISTS, or vice versa.

Is that a fair assumption?


Edit: The reason I'm asking is that in many cases you can rewrite an SQL based on IN to use an EXISTS instead, and vice versa, and for some database engines, the query optimizer will treat the two differently.

For instance:

SELECT *
FROM Customers
WHERE Exists (
    SELECT *
    FROM Orders
    WHERE Orders.CustomerID = Customers.ID
)

can be rewritten to:

SELECT *
FROM Customers
WHERE ID IN (
    SELECT CustomerID
    FROM Orders
)

or with a join:

SELECT Customers.*
FROM Customers
    INNER JOIN Orders ON Customers.ID = Orders.CustomerID

So my question still stands, is the original poster wondering about what IN and EXISTS does, and thus how to use it, or does he ask wether rewriting an SQL using IN to use EXISTS instead, or vice versa, will be a good idea?

Lasse V. Karlsen
I don't know about the OP, but I would like the answer to this question! When should I use EXISTS instead of IN with a subquery that returns IDs?
Roy Tinker
Unfortunately I'm on vacation right now and won't be typing long answers through my iPad, I'll bookmark this and come back to it when I return, after the 25th of July. I apologize if that is insufficient.
Lasse V. Karlsen
+10  A: 

EXISTS will tell you whether a query returned any results. eg:

SELECT * FROM Orders o WHERE EXISTS (
    SELECT * FROM Products p where p.ProductNumber = o.ProductNumber)

IN is used to compare one value to several, and can use literal values, like this:

SELECT * FROM Orders WHERE ProductNumber IN (1, 10, 100)

You can also use query results with the IN clause, like this:

SELECT * FROM Orders WHERE ProductNumber IN (
    SELECT ProductNumber FROM Products WHERE ProductInventoryQuantity > 0)
Matt Hamilton
+1  A: 

I think,

  • EXISTS is when you need to match the results of query with another subquery. Query#1 results need to be retrieved where SubQuery results match. Kind of a Join.. E.g. select customers table#1 who have placed orders table#2 too

  • IN is to retrieve if the value of a specific column lies IN a list (1,2,3,4,5) E.g. Select customers who lie in the following zipcodes i.e. zip_code values lies in (....) list.

When to use one over the other... when you feel it reads appropriately (Communicates intent better).

Gishu
+2  A: 

There is a post here http://wiki.lessthandot.com/index.php/Subquery_typo_with_using_in by Denis Gobo on the dangers of IN.

Remou
+5  A: 

There is an excellent article by Mladen Prajdić which summarises the fine difference

SQL Server: JOIN vs IN vs EXISTS - the logical difference

I'm not sure if this applies to other RDBMS - anyone know?

Flubba
+11  A: 

The exists keyword can be used in that way, but really it's intended as a way to avoid counting:

//this statement needs to check the entire table
select count(*) from [table] where ...

//this statement is true as soon as one match is found
exists ( select * from [table] where ... )

This is most useful where you have if conditional statements, as exists can be a lot quicker than count.

The in is best used where you have a static list to pass:

 select * from [table]
 where [field] in (1, 2, 3)

When you have a table in an in statement it makes more sense to use a join, but it doesn't really matter. The query optimiser will return the same plan either way.

Keith
A: 

Tom from "Ask Tom" answers this well I think.

You also need to watch out for where are "nulls" in the sub-query, which will be handled differently by "in" and "exists" - again answered by Tom.

Nick Pierpoint
A: 

EXIST is very much faster than IN

RAVI
A: 

based rule optimizer the 1 EXIST is very faster than IN (when the subquery results is very large) 2 in is faster than exist (when the subquery results is very small)

based cost optimizer there is not difference

jackson