views:

813

answers:

3

Which SQL would be faster to validate if a particular column has a null value or not, why?

1) SELECT * FROM TABLE1 WHERE COL1 IS NULL

Execute this query and then check if you are able to read any records. If yes there are null values.

2) SELECT COUNT(COL1) FROM TABLE1 WHERE COL1 IS NULL

Read the count which is returned to determine if there are any null records

Working with Oracle10g and SQLServer2005.

+4  A: 

I don't know about Oracle, but for SQL Server this option is probably going to be fastest of all:

SELECT TOP 1 COL1 FROM TABLE1 WHERE COL1 IS NULL;

That way the DBMS only has to read a single row before giving you your answer; the other options have to read all non-null rows. And I've specified COL1 instead of *, so it might be possible to satisfy the query with an index on COL1, making the query faster still.

kquinn
Just to pick a nit, the database may still have to read multiple rows. It will just be able to stop when it finally reaches one that fits the criteria. It still has to read over any that fail the criteria until it gets to that one. Also, using EXISTS would give the same advantage and I believe that it would be ANSI compatible.
Tom H.
+5  A: 

Building on kquinn's answer, in Oracle that would be

SELECT COL1 FROM TABLE1 WHERE COL1 IS NULL AND ROWNUM = 1;

That way the DBMS only has to read a single row before giving you your answer;

That statement is misleading, however. It has to read all rows until it finds one with the missing column value. Then it can stop and return that row.

If there is no such row, it will read the whole table.

so it might be possible to satisfy the query with an index on COL1, making the query faster still.

Specifying only COL1 will not have too much impact, at least on Oracle, where (regular B-Tree) indices cannot be used to find NULL values.

You may want to select more columns anyway (such as the primary key value) if you are interested in identifiying the row later.

Thilo
Hm, in PostgreSQL (my DB of choice), B-Tree indexes (the default) *do* store information about NULL values and *can* be used to satisfy IS NULL conditions on queries; I agree with you that making the index cover the query won't help much if indexes can't be used anyway. You'll want to look into how SQL Server indexes handle NULLs and see if it behaves like Postgres or Oracle.
kquinn
+4  A: 

Count(columnName) will NEVER count NULL values, count skips NULLS when you specify a column name and does count NULLS when you use *

run this

CREATE TABLE testnulls (ID INT)
INSERT INTO testnulls VALUES (1)
INSERT INTO testnulls VALUES (2)
INSERT INTO testnulls VALUES (null)

SELECT count(*) FROM testnulls WHERE ID IS NULL --1

SELECT count(ID) FROM testnulls WHERE ID IS NULL --0

I would use exists instead since it is a boolean operation and will stop at the first occurance of NULL

IF EXISTS (SELECT 1 FROM testnulls WHERE ID IS NULL)
PRINT 'YES'
ELSE
PRINT 'NO'
SQLMenace