views:

402

answers:

2

How to compare equality of value in SQL with null?

For those familiar with C#, here are the results of comparing nullable values:

null == null : true
null == john : false
null == paul : false
john == null : false
john == john : true
john == paul : false
paul == null : false
paul == john : false
paul == paul : true

The easiest solution I found in SQL is to coalesced the nullable fields into some sentinel value(e.g. 'scoobydoo') then compare them

coalesce(A, 'scoobydoo') = coalesce(B, 'scoobydoo')

But that is plain kludge if somebody uses the sentinel value, if A happens to be NULL and B is 'scoobydoo', then the expression above would yield true

This is exactly my purpose on asking for the logic of the code above (T-SQL UPDATE trigger):

-- detect if the value changes

if (select invoice_date from inserted) <> 
   (select invoice_date from deleted) begin

    -- do something to summary tables here

end

How to do equality comparison in SQL with C#-like behavior?

[EDIT: Found the answer here]

Tested the code (Postgres nice boolean support, FTW!):

select

    A, B,

    A = B,
    A IS NOT DISTINCT FROM B, -- "logically" same as above

    A <> B,
    A IS DISTINCT FROM B -- "logically" same as above

from(    
    values
    (null, null),
    (null, 'john'),
    (null, 'paul'),
    ('john', null),
    ('john', 'john'),
    ('john', 'paul'),
    ('paul', null),
    ('paul', 'john'),
    ('paul', 'paul')) as x(A,B)

[EDIT: Tested Jon's code, his answer on equality sorts of semi-work(just treat the null as false anyway), but his answer on inequality bombs out]

Tested the code (Postgres nice boolean support, FTW!):

select

    A, B,

    A = B,
    A IS NOT DISTINCT FROM B, -- "logically" same as above
    coalesce( (A = B) or (A is null and B is null), false ), 
    -- tested Jon's code for ==, semi-work, coalesced to make it true/false only


    A <> B,
    A IS DISTINCT FROM B, -- "logically" same as above
    (A <> B) and (A is not null or B is not null)  
    -- tested Jon's code for !=, bombs out

from(    
    values
    (null, null),
    (null, 'john'),
    (null, 'paul'),
    ('john', null),
    ('john', 'john'),
    ('john', 'paul'),
    ('paul', null),
    ('paul', 'john'),
    ('paul', 'paul')) as x(A,B)

[EDIT: posted another question related to this one]

[EDIT: posted results based on Jon's inquiry on non-working semantics for inequality comparison]

select

    A, B,

    A = B,
    A IS NOT DISTINCT FROM B, -- "logically" same as above
    (A = B) or (A is null and B is null), 
    -- tested Jon's code for ==


    A <> B,
    A IS DISTINCT FROM B -- "logically" same as above,
    (A <> B) and (A is not null or B is not null)  
    -- tested Jon's code for !=, bombs out

from(    
    values
    (null, null),
    (null, 'john'),
    (null, 'paul'),
    ('john', null),
    ('john', 'john'),
    ('john', 'paul'),
    ('paul', null),
    ('paul', 'john'),
    ('paul', 'paul')) as x(A,B)


  a   |  b   | ?column? | ?column? | ?column? | ?column? | ?column? | ?column?
------+------+----------+----------+----------+----------+----------+----------
 null | null | null     | t        | t        | null     | f        | f
 null | john | null     | f        | null     | null     | t        | null
 null | paul | null     | f        | null     | null     | t        | null
 john | null | null     | f        | null     | null     | t        | null
 john | john | t        | t        | t        | f        | f        | f
 john | paul | f        | f        | f        | t        | t        | t
 paul | null | null     | f        | null     | null     | t        | null
 paul | john | f        | f        | f        | t        | t        | t
 paul | paul | t        | t        | t        | f        | f        | f
(9 rows)

the non-working semantics for inequality prompted me to post another question :-)

[EDIT: Tested Jon's new answer]

select

    A, B,

    A = B as e,
    A IS NOT DISTINCT FROM B AS e_works, -- "logically" same as above
    (A = B) or (A is null and B is null) AS e_semi_work, -- tested Jon's code for ==, works if we treat null as false


    A <> B as ie,
    A IS DISTINCT FROM B as ie_works, -- "logically" same as above,
    (A <> B) and (A is not null or B is not null) as ie_not_work, -- tested Jon's code for !=, bombs out

    (A <> B) or ((A is null or B is null) and (A is not null or B is not null)) as ie_semi_works, -- this works(well it is, if you treat null as false),

     not ((A = B) or (A is null and B is null)) as ie_not_work2 -- this doesn't work


from(    
    values
    (null, null),
    (null, 'john'),
    (null, 'paul'),
    ('john', null),
    ('john', 'john'),
    ('john', 'paul'),
    ('paul', null),
    ('paul', 'john'),
    ('paul', 'paul')) as x(A,B)

Results:

  a   |  b   | e    | e_works | e_semi_work | ie   | ie_works | ie_not_work | ie_semi_works | ie_not_work2
------+------+------+---------+-------------+------+----------+-------------+---------------+--------------
 null | null | null | t       | t           | null | f        | f           | null          | f
 null | john | null | f       | null        | null | t        | null        | t             | null
 null | paul | null | f       | null        | null | t        | null        | t             | null
 john | null | null | f       | null        | null | t        | null        | t             | null
 john | john | t    | t       | t           | f    | f        | f           | f             | f
 john | paul | f    | f       | f           | t    | t        | t           | t             | t
 paul | null | null | f       | null        | null | t        | null        | t             | null
 paul | john | f    | f       | f           | t    | t        | t           | t             | t
 paul | paul | t    | t       | t           | f    | f        | f           | f             | f
(9 rows)
+4  A: 

Edit again... coalescing the result should work and makes things a bit simpler:

Equality:

where COALESCE((A = B) or (A is null and B is null), false)

I agree it's not terribly pleasant.

EDIT: Vilx pointed out a problem with A <> B. I think this will work though:

where (A <> B) or ((A is null or B is null) and
                   (A is not null or B is not null))

It may be simpler to do this though:

where !(COALESCE((A = B) or (A is null and B is null)), false)
Jon Skeet
+1 for providing the under-the-hood semantics of IS DISTINCT FROM
Hao
Umm... pardon me if I'm wrong, but wouldn't the second one fail if A would be NULL and B would not be?
Vilx-
@Vilx: Which bit are you concerned about? `A<>B` will be true, and `(A is not null or B is not null)` will be true because B isn't null - so the overall result is true, as desired.
Jon Skeet
No, A<>B will be false, because one of them will be NULL, and comparing anything to NULL results in NULL. Well, unless you play around with ANSI_NULLS, but then you don't need all this anyway.
Vilx-
Ah, you're absolutely right - oops! Will fix.
Jon Skeet
@Jon: this works(if we treat null as false): where (A <> B) or ((A is null or B is null) and (A is not null or B is not null)); this doesn't: where not((A = B) or (A is null and B is null)) see my query result's above
Hao
@Hao: Right - I'll get rid of the latter then :)
Jon Skeet
+3  A: 

If it's Microsoft SQL Server, then you're looking for the ANSI_NULLS option. If it's another DBMS, you'll have to read the documentation for it. Some of them don't support this at all.

Added: Oh, I noticed you mentioning T-SQL. It is MSSQL then! :)

Vilx-
I like it! I know it's not the *standard*, but damn! Thanks.
Kieron
In theory, it could also be Sybase.
ammoQ
Umm, OK, didn't know that. :)
Vilx-