views:

59

answers:

4

I need to explain this by example:

Is there a best practice or preference for specifying a DateTime and BIT in a database table?

In my database I have a Widget table. I need to know if a widget is "Closed" and it's "Closed Date" Business rules say that if a widget is closed, it must have a closed date. If a widget is not closed, it should not have a "Closed Date".

To design this, I could do the following:

(Example 1):

CREATE TABLE [Widget]
(
    [WidgetID] INT IDENTITY(1,1)
    ,[ClosedDate] DATETIME NULL
)

or (Example 2):

CREATE TABLE [Widget]
(
    [WidgetID] INT IDENTITY(1,1)
    ,[IsClosed] BIT NOT NULL CONSTRAINT [DF_Widget_IsClosed] DEFAULT (0)
    ,[ClosedDate] DATETIME NULL
)

I think that Example 1 is cleaner because it is one less column to have to worry about. But, whenever I need to evaluate whether a Widget is Closed, I would need an extra step to figure out if the ClosedDate column IS NOT NULL.

Example 2 creates extra overhead because now I have to keep both the IsClosed and ClosedDate values in sync.

Is there a best practice when designing something like this? Would querying the table be more performant for Example 2? Is there any reason why I should choose one design over the other?

Note: I would be accessing this value through an ORM tool as well as Stored Procedures.

+1  A: 

The first is better. Checking for null is cheap, whereas keeping a separate flag makes it possible to have a closed date yet not be closed.

Steven Sudit
+3  A: 

I think that option 1 is better. Data integrity is better kept (impossible to have a closed date with a flag which says the inverse), takes less disk space in the case of extra large tables, and queries would still be performant and clear to understand for teammates.

MaxiWheat
A: 

I would not assign semantic meaning to NULL. Doing so will bubble through your business logic and you will get code like ...

public class Widget
{
  // stuff

  public bool IsClosed
  {
    // what do you put here?
    // it was null in the db so you have to use DateTime.MinDate or some such.
    return( _closeDate == ?? );  
  }

  // more stuff
}

Using null in that fashion is bad. NULL (and null) mean "I don't know". You are assigning semantic meaning to that answer when in reality, you should not. The closed status is the closed status and the closed date is the closed date, don't combine them. (God forbid you ever want to re-open a Widget but still remember when it got closed in the first place, for example.)

Eric Lippert has a nice blog post on using null in this way (kidna) as well.

JP Alioto
In the C#, you'd use a DateTime? (Nullable<DateTime>) to correspond to the database field, so that's what you'd be doing in that return statement. I can't say I agree with your advice, because it goes against DRY.
Steven Sudit
I just read Lippert's post and don't see how it supports your view. Null means that no value exists, which is different from any existing value, including empty. In this case, a null close date means that no close date exists, since it's not closed. If you wanted to reopen it and keep track of previous close dates, I suggest you put that in a history table.
Steven Sudit
+2  A: 

I think you have the IsClosed column as a computed column.

CREATE TABLE [Widget](    
[WidgetID] INT IDENTITY(1,1),
[ClosedDate] DATETIME NULL,
IsClosed AS CAST(CASE WHEN ClosedDate IS NULL THEN 0 ELSE 1 END AS BIT)
)

The reason is that you are not storing anything and you can now code your application code and stored procs to use this column. If your business rule ever changes you can convert this into a real column and you will not need to change other code. Otherwise you will have business logic sprinkled throughout your application code and stored procs. This way, it is only in 1 place.

Finally, when you move to SQL2005 you can add the "Persisted" clause. So it will be stored increasing the performance slightly and you will not have an issue with keeping them in sync.

JBrooks
+1 perfect - my thoughts exactly! One column you're working with, and a computed column (which is automatically updated) to signal the condition.
marc_s
This is an interesting compromise. For one thing, even if it's not persisted, you can create an index on it, which effectively persists it. On the other hand, I'm quite comfortable with using null to mean null.
Steven Sudit