views:

173

answers:

7

So, I asked a question this morning, which I did not phrase correctly, so I got a lot of responses as to why NULL compared to anything will give NULL/FALSE.

My actual question was, what is the time honored fashion in which db guys test inequalities for two columns that can both be NULL. My question is the exact opposite of this question.

The requirements are as follows, A and B are two columns:
a) if A and B are both NULL, they are equal, return FALSE
b) if A and B are both not NULL, then return A<>B
c) if either A or B are NULL, they are not equal, return TRUE

+3  A: 

Depending on the data type and possible values for the columns:

COALESCE(A, -1) <> COALESCE(B, -1)

The trick is finding a value (here I used -1) that will NEVER appear in your data.

The other way would be:

(A <> B) OR (A IS NOT NULL AND B IS NULL) OR (A IS NULL AND B IS NOT NULL)

This can be a problem depending on how your particular RDBMS handles NULLs. By the ANSI standard, this should give you what you want, but who follows standards anyway. :)

P.S. - I should also point out that using the COALESCE function may invalidate the use of indexes in comparing the columns. Check your query plan and performance of the query to see if that's a problem.

P.P.S. - I just noticed that OMG Ponies mentioned that Informix doesn't support COALESCE. It's an ANSI standard function I believe, but see what I said above about standards...

Tom H.
@Tom: Not that it doesn't support it, just that it seems to depend on the version/edition.
OMG Ponies
And wrapping the column in any function means not using an index if present.
OMG Ponies
The coalesce query does not work if -1 is a valid value for the column. If I code it like that, and today -1 is not a valid value, and later it becomes a valid value, this will be a bug where a null field is equal to a -1 field.
prmatta
COALESCE()ing is great for this purpose
Josh Smeaton
@OMG Ponies - Thanks for the clarification. As for the function invalidating indexes, I mentioned that above in my P.S.
Tom H.
@prmatta - That's why I used all cap, bold letters for "never". Personally, I use the second format for just this reason (and the indexing issue) when coding stored procedures, etc.
Tom H.
OMG Ponies
@Tom: I ended up using your second format.
prmatta
Glad to hear that I could help :)
Tom H.
A: 

If you want to be sure about how NULLs are handled, you'll have to use whatever Informix supports for null checking. I haven't turned up much, other than the SE version doesn't support COALESCE, but it does support DECODE and possibly CASE.

WHERE COALESCE(t.a, 0) != COALESCE(t.b, 0)
WHERE DECODE(NULL, 0, t.a) != DECODE(NULL, 0, t.b)
OMG Ponies
Coalescing to 0 is probably bad since would be equal to false.
jleedev
@jleedev: The requirement is that `a` and `b` should not equal each other.
OMG Ponies
If one is `0` and the other is `NULL`, then you get `0 != 0`, which is false where it should be true.
jleedev
@jleedev: It's an example - we don't know what the data type is.
OMG Ponies
+2  A: 

I would personally write out the expression you came up with, especially if the table is expected to grow large. Wrapping the columns in function calls hurts performance by making it so the engine can't use any indexes you have on those columns. Of course, in a small table, this may not be any sort of issue, but I still like to do it the explicit way just in case a table ends up growing.

Donnie
A: 

For SQL Server, use:

WHERE ISNULL(A, '') <> ISNULL(B, '')
Gordon Bell
This doesn't work if '' is a valid value for the column.
prmatta
The DB is informix
OMG Ponies
+1  A: 

can you try something like this in informix?

CASE
    WHEN a IS NULL AND B IS NULL THEN false 
    WHEN a IS NULL OR B IS NULL THEN true
    ELSE a <> B
END

from IBM Informix Guide to SQL: Syntax , CASE Expressions

astander
I don't know if that would work in a WHERE clause
OMG Ponies
Yes, CASE-WHEN, COALESCE and NULLIF are all ANSI SQL92 and work in WHERE.
bobince
A: 

The trouble is that a<>b (or a=b) yields NULL, not 1 or 0 when one or both operands are NULL. This doesn't matter for the = case because NULL OR 1 is 1 and NULL OR 0 is NULL which behaves like 0 for selecting in a WHERE clause.

You could say:

a<>b OR (a IS NULL)<>(b IS NULL)

However needing to do it either way may be a sign that you're misusing NULL and should consider changing the schema to use some other NOT NULL value to signify this comparable condition.

For example if you've got a person table with a title column, don't use NULL to signify that they have no title; that's not a ‘missing’ datum, it's just that no title exists. So store it as an empty string '' that you can happily compare with other empty strings. (Well unless you run Oracle of course, with its Empty String Problem...)

bobince
Actually, my query does not work. If only one of A or B is NULL, my query will return FALSE, which is wrong.
prmatta
Ah... actually not 0 (false), it's returning NULL due to the comparison... but then in the equals case it doesn't matter and in the unequals it does. Editing...
bobince
@bobince your solution is very elegant, and correct, however it gives a syntax error in informix 11.5
prmatta
Really? What's the error? Looks like valid ANSI SQL:1992 to me, unless I'm drunk and have missed something. not that I'd ever be drunk on SO, obviously. Hardly ever. Except sometimes. But mostly not.
bobince
Doesn't the following work for Informix? where a <> b or nvl(a, 't') <> nvl(b, 't')
calvinkrishy
A: 

IBM Informix Dynamic Server has a somewhat peculiar view of booleans for a variety of historical (aka 'bad') reasons. Adapting the idea suggested by @astander, this CASE expression 'works', but I'd be the first to say 'not obvious' (see - I said it before you did!). The setup phase:

create table x(a int, b int);
insert into x values(null, null);
insert into x values(null, 1);
insert into x values(1, null);
insert into x values(1, 1);
insert into x values(1, 2);

The SELECT statement:

SELECT *
  FROM x
  WHERE   CASE
          WHEN a IS NULL AND b IS NULL THEN 'f'::BOOLEAN
          WHEN a IS NULL OR  b IS NULL THEN 't'::BOOLEAN
          WHEN a != b                  THEN 't'::BOOLEAN
          ELSE                              'f'::BOOLEAN
          END
;

The result from this query is:

                 1
      1           
      1          2

Issues:

  • IDS does not recognize FALSE or TRUE or UNKNOWN as keywords.
  • IDS does not recognize boolean expressions such as 'a != b' (or 'a <> b') as such.

Yes, it pains me greatly to have to state this.

Jonathan Leffler