views:

110

answers:

1

I have a table with a InTime and an OutTime column.

Normally when I insert data into this table I set the InTime to a DateTime and the OutTime to null. When the data is removed a OutTime value is set.

When I’m getting data out for a particular time I use something like:

where InTime < sometime and OutTime is > sometime or OutTime is null

My question is, in terms of getting better query / index performance should I be putting some value into OutTime like the max datetime and make the field not nullable?

Then my query becomes

where InTime < sometime and OutTime is > sometime
+2  A: 

Leave the field NULL. Don't use OR, use UNION ALL:

select ... from ... where InTime < sometime and OutTime is > sometime 
union all
select ... from ... where InTime < sometime and OutTime is null

Using magic values instead of NULL is a recipe for disaster. At the least, it uses more storage. More specifically, it breaks the semantics of NULL when enforcing database constraints, when computing aggregates and in applications.

Using OR in queries is asking for performance trouble. The optmizer will likely turn any index range seeks into scans. Using UNION is usually better, as the optimizer will create two plans, one optimal for the NULLs one otpimal for the non-NULL, and union them.

If you do not have any index on Intime and/or OutTime then the query will be a scan anyway and the UNION will perform worse than an OR, but then that is not a scenario that is worth talking about. The question is, of course, how to optimize a query on properly designed storage.

Remus Rusanu
If you view the execution plan of the original query with the union all query. The union all is 67% relative to the batch.
Spruce
@Spruce: what about IO and CPU and Duration? The % relative is sometimes misleading.
gbn
I've tried this with a couple of queries now, one with a index scan and one with a index seek. In both cases the union all statement runs the query twice and then concatenates the results which is reflected in the %cost relative to batch. As you’d expect the IO and CPU are double for the union all query because it has to run it twice. Am I missing something?
Spruce
Query optimization is equaly, if not more, driven by existing schema as it is by the actual SQL text. You did not supply any schema with your question and, as I said, if the query scans the whole table to produce the result then UNION ALL will actually hurt. that doesn't mean you should use OR, you should fix the schema to avoid unnecessary table scans (in other words, index your tables).
Remus Rusanu