This issue came up when I got different record counts for what I thought were identical queries, one using a not in where constraint and the other a left join. The table in the not in constraint had one null value (bad data), which caused that query to return a count of 0 records. I sort of understand why, but I could use some help fully grasping the concept.

To state it simply, why does query A return a result but B doesn't?

A: select 'true' where 3 in (1, 2, 3, null)
B: select 'true' where 3 not in (1, 2, null)

This was on SQL Server 2005. I also found that calling set ansi_nulls off causes B to return a result.

+4  A: 

Comparison to null is undefined unless you use IS NULL.

So, when comparing 3 to NULL (query A), it returns undefined.

I.e. SELECT 'true' where 3 in (1,2,null) and SELECT 'true' where 3 not in (1,2,null)

will produce the same result, as NOT (UNDEFINED) is still undefined, but not TRUE

Sunny
+5  A: 

In A, 3 is tested for equality against each member of the set, yielding (FALSE, FALSE, TRUE, UNKNOWN). Since one of the elements is TRUE, the condition is TRUE. (It's also possible that some short-circuiting takes place here, so it actually stops as soon as it hits the first TRUE and never evaluates 3=NULL.)

In B, I think it is evaluating the condition as NOT (3 in (1,2,null)). Testing 3 for equality against the set yields (FALSE, FALSE, UNKNOWN), which is aggregated to UNKNOWN. NOT ( UNKNOWN ) yields UNKNOWN. So overall the truth of the condition is unknown, which at the end is essentially treated as FALSE.

Dave Costa
A: 

Since null is an unknown, a not in query containing a null in the list of possible values will always return 0 records, because there is no way to be sure that the null value is not the value being tested.

YonahW
+22  A: 

Query A is the same as:

select 'true' where 3 = 1 or 3 = 2 or 3 = 3 or 3 = null

Since 3 = 3 is true, you get a result.

Query B is the same as:

select 'true' where 3 <> 1 and 3 <> 2 and 3 <> null

When ansi_nulls is on, 3 <> null is UNKNOWN, so the predicate evaluates to UNKNOWN, and you don't get any rows.

When ansi_nulls is off, 3 <> null is true, so the predicate evaluates to true, and you get a row.
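The two queries and the expanded form of B can be run side by side. The sketch below is only an illustration: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for SQL Server (an assumption, not the original setup). SQLite applies the same three-valued logic but has no ansi_nulls setting, so only the "on" behaviour is shown.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server 2005 here. It follows the same
# three-valued logic for NULL comparisons but has no ansi_nulls switch.
conn = sqlite3.connect(":memory:")

query_a = "select 'true' where 3 in (1, 2, 3, null)"
query_b = "select 'true' where 3 not in (1, 2, null)"
# Query B written out as the chain of <> comparisons from the answer above.
query_b_expanded = "select 'true' where 3 <> 1 and 3 <> 2 and 3 <> null"

rows_a = conn.execute(query_a).fetchall()
rows_b = conn.execute(query_b).fetchall()
rows_b_expanded = conn.execute(query_b_expanded).fetchall()

print("A:", rows_a)                    # one row
print("B:", rows_b)                    # no rows
print("B expanded:", rows_b_expanded)  # no rows
```

Query B and its expanded form agree, which is what you would expect if the engine treats not in as a chain of <> comparisons joined by and.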

Brannon
One correction: it is actually three-valued logic, so 3 <> NULL is not FALSE but UNKNOWN. If it were FALSE, NOT (3 <> null) would evaluate to TRUE, but it does not; it is still UNKNOWN. You can test it by running select 'true' where 3 in (null) and select 'true' where 3 not in (null); both give no result.
kristof
@kristof: Yes, you're right. I will correct my answer.
Brannon
Has anybody ever pointed out that converting `NOT IN` to a series of `<> and` changes the semantic behavior of *not in this set* to something else?
Ian Boyd
+3  A: 

Null signifies an absence of data, that is, it is unknown, not a data value of nothing. It's very easy for people from a programming background to confuse this, because in C-type languages, when using pointers, null is indeed nothing.

Hence in the first case 3 is indeed in the set of (1,2,3,null), so true is returned.

In the second however you can reduce it to

select 'true' where 3 not in (null)

So nothing is returned because the parser knows nothing about the set to which you are comparing it: it's not an empty set but an unknown set. Using (1, 2, null) doesn't help, because 3 is obviously not in (1, 2), but you're then combining that with the unknown comparison against null, and the result is unknown.

Cruachan
A: 

Any compare against NULL is always FALSE. You can use IS NULL or IS NOT NULL only.

DiGi
Not true. Any compare against null is null. Here's some fun reading: http://en.wikipedia.org/wiki/Null_(SQL)#Boolean_datatype_inconsistency
David B
A: 

Also, this might be useful for understanding the logical difference between join, exists and in: http://weblogs.sqlteam.com/mladenp/archive/2007/05/18/60210.aspx

Mladen
+4  A: 

Whenever you use NULL you are really dealing with a Three-Valued logic.

Your first query returns results as the WHERE clause evaluates to:

    3 = 1 or 3 = 2 or 3 = 3 or 3 = null
which is:
    FALSE or FALSE or TRUE or UNKNOWN
which evaluates to 
    TRUE

The second one:

    3 <> 1 and 3 <> 2 and 3 <> null
which evaluates to:
    TRUE and TRUE and UNKNOWN
which evaluates to:
    UNKNOWN

UNKNOWN is not the same as FALSE. You can easily test this by calling:

select 'true' where 3 <> null
select 'true' where not (3 <> null)

Both queries will give you no results.

If UNKNOWN were the same as FALSE, then, assuming the first query evaluated to FALSE, the second would have to evaluate to TRUE, since it would be the same as NOT (FALSE).
That is not the case.
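The truth tables behind this can be modelled in a few lines. The sketch below is a hypothetical model, not how any database engine is implemented: it uses Python's None to stand for UNKNOWN, and the helper names (not3, and3, or3, eq3, in3, not_in3, where_keeps_row) are made up for illustration.

```python
# A small model of SQL three-valued logic. None stands for UNKNOWN.
# All function names are hypothetical helpers for this illustration.

def not3(a):
    # NOT UNKNOWN stays UNKNOWN
    return None if a is None else (not a)

def and3(a, b):
    # FALSE dominates AND; otherwise any UNKNOWN makes the result UNKNOWN
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    # TRUE dominates OR; otherwise any UNKNOWN makes the result UNKNOWN
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def eq3(x, y):
    # Any comparison involving NULL is UNKNOWN
    return None if x is None or y is None else x == y

def in3(x, values):
    # x IN (v1, v2, ...) is x = v1 OR x = v2 OR ...
    result = False
    for v in values:
        result = or3(result, eq3(x, v))
    return result

def not_in3(x, values):
    # x NOT IN (...) is NOT (x IN (...))
    return not3(in3(x, values))

def where_keeps_row(predicate):
    # A WHERE clause keeps a row only when the predicate is TRUE
    return predicate is True

print(in3(3, [1, 2, 3, None]))       # True
print(not_in3(3, [1, 2, None]))      # None, i.e. UNKNOWN
print(not3(not3(eq3(3, None))))      # None: negating UNKNOWN never yields TRUE
print(where_keeps_row(not_in3(3, [1, 2, None])))  # False: the row is dropped
```

The last line is where UNKNOWN ends up looking like FALSE: the row is filtered out either way, but only because WHERE demands TRUE, not because the predicate was FALSE.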

There is a very good article on this subject on SqlServerCentral

The whole issue of NULLs and Three-Valued Logic can be a bit confusing at first but it is essential to understand in order to write correct queries in TSQL

Another read I would recommend is SQL Aggregate Functions and NULL

Hope that helps,

kristof
+1  A: 

I am facing the same issue. There are records for this query, so it should return at least one row, but it does not. Can anyone tell me how to run this query?

SQL> select party_code from abc where party_code not in (select party_code from xyz);

no rows selected

SQL>

A: 

this is for Boy:

select party_code from abc as a where party_code not in ( select party_code from xyz where party_code = a.party_code );

this works regardless of ansi settings

For the original question:

B: select 'true' where 3 not in (1, 2, null)

a way to remove the nulls is needed, e.g.

select 'true' where 3 not in (1, 2, isnull(null,0))

The overall logic is: if NULL is the cause, then find a way to remove NULL values at some step in the query.

select party_code from abc as a where party_code not in ( select party_code from xyz where party_code is not null)

But good luck if you forgot the field allows nulls, which is often the case.
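To see the subquery variants behave differently, here is a rough sketch using Python's sqlite3 module with an in-memory database (SQLite is an assumption standing in for the real server). The table names abc and xyz come from the question above; the sample rows ('P1', 'P2', 'P3' and one null) are invented for illustration. The not exists form is not from the posts above; it is added as a commonly used alternative that is not affected by nulls in the subquery.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table abc (party_code text);
    create table xyz (party_code text);
    -- Hypothetical sample data: xyz contains one null (the "bad data").
    insert into abc values ('P1'), ('P2'), ('P3');
    insert into xyz values ('P1'), (null);
""")

def codes(sql):
    """Run a query and return the first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

# The null in xyz makes the predicate UNKNOWN for every non-matching row.
plain = codes(
    "select party_code from abc "
    "where party_code not in (select party_code from xyz)")

# Correlated subquery: rows of xyz with a null never satisfy the equality.
correlated = codes(
    "select party_code from abc as a where party_code not in "
    "(select party_code from xyz where party_code = a.party_code)")

# Filtering the nulls out of the subquery explicitly.
filtered = codes(
    "select party_code from abc where party_code not in "
    "(select party_code from xyz where party_code is not null)")

# Added alternative (not from the posts above): not exists.
not_exists = codes(
    "select a.party_code from abc as a where not exists "
    "(select 1 from xyz as x where x.party_code = a.party_code)")

print(plain)       # []
print(correlated)  # ['P2', 'P3']
print(filtered)    # ['P2', 'P3']
print(not_exists)  # ['P2', 'P3']
```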
A: 

The link below provides good information about NULL in SQL and how to handle it in SQL queries:

http://www.a2zmenu.com/MySql/Working-with-NULL-Values-in-SQL.aspx

rs.emenu