tags:

views:

68

answers:

3

I have a table with a nullable datetime field.

I'll execute queries like this:

select * from TABLE where FIELD is not null
select * from TABLE where FIELD is null

Should I index this field or is not necessary? I will NOT search for some datetime value in that field.

+1  A: 

In short: yes.

Slightly longer: yeeees. ;-)

(From http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html) - "A search using col_name IS NULL employs indexes if col_name is indexed."

middaparka
This is incorrect. Create a table with X rows, half of them with null datetime value, half of them with not null. Create an index, run `explain select` on your query - index will not be used.
ChssPly76
There is no error in documentation but you do need to read and understand the whole thing, not just the excerpt that seems to deal with the question at hand. Here's another (more relevant) quote from the same link: "Sometimes MySQL does not use an index, even if one is available. One circumstance under which this occurs is when the optimizer estimates that using the index would require MySQL to access a very large percentage of the rows in the table. (In this case, a table scan is likely to be much faster because it requires fewer seeks.)"
ChssPly76
+2  A: 

It's probably not necessary.

The only possible edge case when index can be used (and be of help) is if the ratio of null / not-null rows is rather big (e.g. you have 100 NULL datetimes in the table with 100,000 rows). In that case select * from TABLE where FIELD is null would use the index and be considerably faster for it.

ChssPly76
A: 

It would depend on the number of unique values and the number of records in the table. If your just searching on whether or not a column is null or not, you'll probably have one query use it and one not depending on the amount of nulls in the table overall.

For example: If you have a table with 99% of the records have the querying column as null and you put/have an index on the column and then execute:

SELECT columnIndexed FROM blah WHERE columnIndexed is null;

The optimizer most likely won't use the index. It won't because it will cost more to read the index and then read the associated data for the records, than to just access the table directly. Index usage is based on the statistical analysis of a table, and one major player in that is cardinality of the values. In general, indexes work best and give the best performance when they select a small subset of the rows in the table. So if you change the above query to select where columnIndexed is not null, your bound to use the index.

For more details check out the following: http://dev.mysql.com/doc/refman/5.1/en/myisam-index-statistics.html

RC