ansaurus

Question

Answer 1

A:

NULL looks fine to me for this purpose. Performance is likely to be basically the same as with a non-null column and constant value, or maybe even better for filtering out all NULLs.

Lucero 2009-07-01 17:21:24

Answer 2

+2 A:

They do not have a negative performance hit on the database. Remember, NULL is more of a state than a value. Checking for NOT NULL vs setting that value to a -1 makes no difference other than the -1 is probably breaking your data integrity, imo.
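To make the comparison concrete, here is a rough sketch of the two filters. The table and column names (MyTable, QuickPickOrder) are borrowed from other answers in this thread, and the -1 sentinel variant is hypothetical, so treat this as an illustration rather than the asker's actual schema.

```sql
-- Variant 1: NULL marks "this row is not a quick pick".
SELECT *
FROM   MyTable
WHERE  QuickPickOrder IS NOT NULL
ORDER BY QuickPickOrder;

-- Variant 2 (hypothetical): -1 is used as a sentinel instead of NULL.
-- The filter costs about the same, but -1 is now a magic value that
-- every query, aggregate and constraint has to remember to exclude.
SELECT *
FROM   MyTable
WHERE  QuickPickOrder <> -1
ORDER BY QuickPickOrder;
```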

northpole 2009-07-01 17:23:20

Answer 3

+3 A:

SQL Server indexes NULL values, so this will most probably just use the Index Seek over an index on QuickPickOrder, both for filtering and for ordering.
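A minimal sketch of the setup this assumes. The index name is made up, and the query is only my guess at the shape of the one in the question; the table name MyTable comes from another answer here.

```sql
-- Hypothetical nonclustered index on the nullable column.
CREATE NONCLUSTERED INDEX IX_MyTable_QuickPickOrder
    ON MyTable (QuickPickOrder);

-- Assumed shape of the query being discussed: filter out the NULLs
-- and return the remaining rows in QuickPickOrder order.
SELECT *
FROM   MyTable
WHERE  QuickPickOrder IS NOT NULL
ORDER BY QuickPickOrder;
```

Whether the optimizer actually picks a seek on this index depends on the table size and on how many rows are non-NULL, so it is worth checking the execution plan on real data.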

Quassnoi 2009-07-01 17:25:14

If the table's column is 50% null (similar to the given sample data), I'd think it may tend to do an index scan.

KM 2009-07-01 17:34:08

@KM: why would it do an Index Scan here? It may do a Table Scan / Clustered Index Scan to avoid RID Lookups / Key Lookups, but we have a range condition here, so an Index Seek will always be superior to an Index Scan.

Quassnoi 2009-07-01 17:36:39

Won't it ignore the index and scan, since most values are the same (null)?

KM 2009-07-01 17:38:54

@KM: it may do that, but it won't be an INDEX Scan, it will be a TABLE Scan or CLUSTERED Index Scan (which is the same as a Table Scan for clustered tables). An Index Scan implies traversing the whole index on QuickPickOrder, filtering out the wrong values and then joining with the table using Key Lookup / RID Lookup to fetch the * requested by the SELECT clause. An Index Seek does the same but starts from the first non-NULL value, so the NULLs are simply skipped.

Quassnoi 2009-07-01 17:42:20

Answer 4

+1 A:

The alternative is to normalize QuickPickOrder into a table with a foreign key, and then perform an inner join to filter the nulls out (or a left join with a where clause to filter the non-nulls out).
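Roughly, the two joins would look like this. The QuickPicks table and its MyTableID column are assumed names (they match the layout in another answer in this thread), so adjust to your own schema.

```sql
-- Rows that ARE quick picks: the inner join drops every row without a match.
SELECT m.*
FROM   MyTable AS m
JOIN   QuickPicks AS q ON q.MyTableID = m.ID
ORDER BY q.QuickPickOrder;

-- Rows that are NOT quick picks: left join, then keep only the unmatched rows.
SELECT m.*
FROM   MyTable AS m
LEFT JOIN QuickPicks AS q ON q.MyTableID = m.ID
WHERE  q.MyTableID IS NULL;
```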

mgroves 2009-07-01 17:25:36

Answer 5

A:

NULL looks good to me as well. SQL Server has many kinds of indices to choose from. I forget which ones do this, but some only index values in a given range. If you had that kind of index on the column being tested, the NULL valued records would not be in the index, and the index scan would be fast.
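One feature that fits this description is a filtered index, available in SQL Server 2008 and later. A sketch, with a made-up index name and the MyTable / QuickPickOrder names assumed from the rest of the thread:

```sql
-- Filtered index: only rows with a non-NULL QuickPickOrder are stored,
-- so the index stays small when most rows are NULL.
CREATE NONCLUSTERED INDEX IX_MyTable_QuickPickOrder_NotNull
    ON MyTable (QuickPickOrder)
    WHERE QuickPickOrder IS NOT NULL;
```

This only helps queries whose WHERE clause matches the index filter, such as QuickPickOrder IS NOT NULL.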

Paul Chernoch 2009-07-01 17:25:45

Answer 6

+2 A:

Another alternative would be two tables:

MyTable:

ID | ImportantData
------------------
1  | 'Some Text'
2  | 'Other Text'
3  | 'abcdefg'
4  | 'whatever'
5  | 'it is'
6  | 'technically'
7  | 'a varchar'
8  | 'of course'
9  | 'but that'
10 | 'is not'
11 | 'important'

QuickPicks:

MyTableID   | QuickPickOrder
--------------------------
2           | 3
4           | 4
5           | 2
8           | 1
11          | 5

SELECT   MyTable.*
FROM     MyTable JOIN QuickPicks ON QuickPicks.MyTableID = MyTable.ID
ORDER BY QuickPickOrder

This would allow updating QuickPickOrder without locking anything in MyTable or logging a full row transaction for that table. So depending on how big MyTable is, and how often you are updating QuickPickOrder, there may be a scalability advantage.

Also, having a separate table will allow you to add a unique index on QuickPickOrder to ensure no duplication, and could be more easily scaled later to allow different kinds of QuickPicks, having them specific to certain contexts or users, etc.
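A possible definition for the second table, as a sketch only. The INT column types and the constraint names are assumptions, and the foreign key requires MyTable.ID to be a primary key or otherwise unique.

```sql
CREATE TABLE QuickPicks (
    MyTableID      INT NOT NULL,
    QuickPickOrder INT NOT NULL,
    -- At most one quick-pick entry per MyTable row.
    CONSTRAINT PK_QuickPicks PRIMARY KEY (MyTableID),
    -- Every quick pick must point at an existing MyTable row.
    CONSTRAINT FK_QuickPicks_MyTable
        FOREIGN KEY (MyTableID) REFERENCES MyTable (ID),
    -- No two quick picks may share the same position.
    CONSTRAINT UQ_QuickPicks_QuickPickOrder UNIQUE (QuickPickOrder)
);

-- Changing a position touches only the small QuickPicks table.
UPDATE QuickPicks
SET    QuickPickOrder = 6
WHERE  MyTableID = 2;
```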

richardtallent 2009-07-01 17:30:35

Answer 7

A:

Having a lot of NULLs in a column that has an index on it (or a composite index that starts with it) is generally beneficial to this kind of query.

Where NULL values are not entered into the index (this depends on the database engine and the index type), inserting / updating rows with NULL in that column doesn't take the performance hit of having to update another secondary index. If, say, only 0.001% of your rows have a non-null value in that column, the IS NOT NULL query becomes pretty efficient as it just scans a relatively small index.

Of course, all of this is relative: if your table is tiny anyway, it makes no appreciable difference.

MarkR 2009-07-02 06:38:38

Answer 8

A:

SQL Server's performance can be affected by using NULLs in your database. There are several reasons for this.

First, NULLs that appear in fixed-length columns (CHAR) take up the entire size of the column. So if you have a column that is 25 characters wide, and a NULL is stored in it, then SQL Server must still reserve 25 characters for the NULL value. This added space increases the size of your database, which in turn means that it takes more I/O to find the data you are looking for. Of course, one way around this is to use variable-length columns instead. When NULLs are stored in a variable-length column, space is not wasted as it is with fixed-length columns.

Second, using IS NULL in your WHERE clause can mean that an index is not used for the query and a table scan is performed instead. This can greatly reduce performance.

Third, the use of NULLs can lead to convoluted Transact-SQL code, which can mean code that doesn't run efficiently or that is buggy.

Ideally, NULLs should be avoided in your SQL Server databases.

Instead of using NULLs, use a coding scheme similar to this in your databases:

NA: Not applicable
NYN: Not yet known
TUN: Truly unknown

Such a scheme provides the benefits of using NULLs, but without the drawbacks.

Prashant 2009-07-02 06:52:04


Query Performance with NULL
