I have quite a few situations where I have database structures similar to:

TABLE Event (EventID INT PrimaryKey, Start DATETIME, Finish DATETIME);
TABLE EventTask (EventTaskID INT PrimaryKey, EventID INT ForeignKey, TaskStart DATETIME, TaskFinish DATETIME);

A 1-to-many relationship between Event and EventTask, etc.

When the dates in the Event table are changed, the EventTask dates need to be changed as well - not difficult with the supplied date functions, but date manipulation is always just a bit tricky.

I was wondering if it might be better to replace the TaskStart DATETIME and TaskFinish DATETIME with two INTs: an offset from Event.Start (in minutes) and a Duration.

This way date updates become much more robust, as only a single row needs an update.

(I stipulate that this only applies where the EventTask dates are absolutely dependent on the Event dates)
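
For illustration, something like this is what I have in mind (the offset/duration column names are just placeholders):

TABLE EventTask (
    EventTaskID INT PrimaryKey,
    EventID INT ForeignKey,
    StartOffsetMins INT,   -- minutes after Event.Start
    DurationMins INT);     -- task length in minutes

-- Shifting an event then only touches the Event row:
UPDATE Event
SET Start  = DATEADD(day, 1, Start),
    Finish = DATEADD(day, 1, Finish)
WHERE EventID = @EventID;

-- Absolute task times are derived when needed:
SELECT DATEADD(minute, et.StartOffsetMins, e.Start)                   AS TaskStart,
       DATEADD(minute, et.StartOffsetMins + et.DurationMins, e.Start) AS TaskFinish
FROM EventTask et
JOIN Event e ON e.EventID = et.EventID;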

+2  A: 

Yes, that sounds entirely reasonable to me. The main downside would be that in order to find the actual times of EventTasks, you have to perform calculations. That will slow down anything returning the times, and in particular will hurt queries involving EventTask times in the filter - e.g. "find me all tasks which occur between times X and Y." Those could previously have used an index, but won't be able to any more.
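
To illustrate with a sketch (TaskStartOffset is an assumed INT column holding minutes from Event.Start; @X and @Y are the filter bounds):

-- With a real TaskStart column, a range filter can seek on an index:
SELECT et.EventTaskID
FROM EventTask et
WHERE et.TaskStart >= @X AND et.TaskStart < @Y;

-- With offsets, each row's time has to be reconstructed first,
-- so an index on TaskStartOffset can't be used for the range:
SELECT et.EventTaskID
FROM EventTask et
JOIN [Event] e ON e.EventID = et.EventID
WHERE DATEADD(minute, et.TaskStartOffset, e.Start) >= @X
  AND DATEADD(minute, et.TaskStartOffset, e.Start) < @Y;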

Jon Skeet
Good point re querying the EventTask times, Jon. The 'original' design would need a view to recreate it anyway; I wonder if you could apply an index to that view (as per SQL Server 2008)?
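A rough sketch of what I mean, using the placeholder offset columns from the question (StartOffsetMins, DurationMins):

CREATE VIEW dbo.EventTaskTimes
WITH SCHEMABINDING
AS
SELECT et.EventTaskID,
       et.EventID,
       DATEADD(minute, et.StartOffsetMins, e.Start)                   AS TaskStart,
       DATEADD(minute, et.StartOffsetMins + et.DurationMins, e.Start) AS TaskFinish
FROM dbo.EventTask et
JOIN dbo.[Event] e ON e.EventID = et.EventID;
GO
-- A unique clustered index materialises the calculated times:
CREATE UNIQUE CLUSTERED INDEX IX_EventTaskTimes ON dbo.EventTaskTimes (EventTaskID);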
realcals
@realcals: Possibly. I've no idea how efficient it would be. Database engineers are frighteningly clever people sometimes :) Anyway, something to test before you get too far down the road, I'd say.
Jon Skeet
A: 

If you are using SQL Server 2008 you could use the DATETIMEOFFSET data type.

If you would like to get at the data "directly" without too much hassle, you could use computed columns, but you might not be able to create indexes on them (or have them persist the result) if the result is non-deterministic.

Your structure would be like this:

CREATE TABLE [Event] (
    EventID INT PRIMARY KEY,
    Start DATETIME,
    Finish DATETIME)

CREATE FUNCTION dbo.getStartDateByEvent (@eventId INT, @taskStart DATETIMEOFFSET)
RETURNS DATETIME
WITH SCHEMABINDING
AS
BEGIN
    RETURN (SELECT e.Start + @taskStart FROM dbo.[Event] e WHERE e.EventID = @eventId)
END

CREATE FUNCTION dbo.getStopDateByEvent (@eventId INT, @taskFinish DATETIMEOFFSET)
RETURNS DATETIME
WITH SCHEMABINDING
AS
BEGIN
    RETURN (SELECT e.Finish + @taskFinish FROM dbo.[Event] e WHERE e.EventID = @eventId)
END

CREATE TABLE [EventTask] (
    EventTaskID INT PRIMARY KEY,
    EventID INT REFERENCES [Event](EventID),
    TaskStart DATETIMEOFFSET,
    TaskFinish DATETIMEOFFSET,
    EventTaskStart AS dbo.getStartDateByEvent(EventID, TaskStart),  -- PERSISTED only if deterministic
    EventTaskStop  AS dbo.getStopDateByEvent(EventID, TaskFinish)   -- PERSISTED only if deterministic
)
Bogdan Maxim
I don't see how that helps - it doesn't let you specify an offset from a field in a different table, does it?
Jon Skeet
You still have to compute the value. You cannot use computed columns that directly reference another table, but you can do it with user-defined functions.
Bogdan Maxim
Here is an example of computed columns with datetime result: http://stackoverflow.com/questions/246666/persisting-a-computed-datetime-column-in-sql-server-2005/247092#247092
Bogdan Maxim
The functions need to be declared with "SCHEMABINDING"
Bogdan Maxim
A: 

Jon Skeet:

…you have to perform calculations. That will slow down anything returning the times, and in particular will hurt queries involving EventTask times in the filter - e.g. "find me all tasks which occur between times X and Y."

It seems to me you already have this requirement when writing basic data integrity constraints, e.g. that an event's task dates must fall within the dates of the event itself. Expanding your schema into SQL DDL:

CREATE TABLE Event 
(
 EventID INTEGER NOT NULL PRIMARY KEY, 
 event_Start DATETIME NOT NULL, 
 event_Finish DATETIME NOT NULL, 
 CHECK (event_Start < event_Finish), 
 UNIQUE (event_Start, event_Finish, EventID)
);

CREATE TABLE EventTask 
(
 EventTaskID INTEGER NOT NULL PRIMARY KEY, 
 EventID INTEGER NOT NULL, 
 event_Start DATETIME NOT NULL, 
 event_Finish DATETIME NOT NULL, 
 FOREIGN KEY (event_Start, event_Finish, EventID)
    REFERENCES Event (event_Start, event_Finish, EventID)
    ON DELETE CASCADE
    ON UPDATE CASCADE, 
 event_task_START DATETIME NOT NULL, 
 event_task_Finish DATETIME NOT NULL, 
 CHECK (event_task_Start < event_task_Finish), 
 CHECK (event_Start <= event_task_Start), 
 CHECK (event_task_Finish <= event_Finish)
);

If an event's tasks are sequential then you'd also need to write a constraint to prevent overlapping periods, and this would involve a subquery (most SQL products do not support subqueries in CHECK constraints, e.g. in SQL Server you would need to resort to a trigger). If you had to recalculate offsets using temporal functions (DATEADD etc.) then performance could be a problem in a high-activity environment, not to mention the increased complexity of the SQL DDL code.
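
For example, a minimal sketch of the trigger route in SQL Server (table and column names as above):

CREATE TRIGGER trg_EventTask_NoOverlap
ON EventTask
AFTER INSERT, UPDATE
AS
BEGIN
    -- Reject the change if any task of the same event overlaps another
    IF EXISTS (SELECT 1
               FROM inserted i
               JOIN EventTask t
                 ON t.EventID = i.EventID
                AND t.EventTaskID <> i.EventTaskID
                AND t.event_task_Start < i.event_task_Finish
                AND i.event_task_Start < t.event_task_Finish)
    BEGIN
        RAISERROR ('Overlapping tasks for the same event.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END;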

For these reasons I would model periods using a start and end date pair, as suggested by your spec.

onedaywhen