views:

489

answers:

4

For you database design/performance gurus out there.

If you have a database that is intended to track financial data for fiscal year periods, is it better/more performance/more clear to do daterange type searches like PaymentDate Between X and Y or is it better to keep a int-key based table with fiscal year periods defined in it and tag the payment table with the payment date and that key, so the where clause is where FiscalPeriodID = X?

I'm sure for smaller datasets it doesn't matter, but let's assume this data will be in the millions of rows.

A: 

What you end up doing with significantly large financial datasets is 'data cubes'.

This basically refers to the process of generating the reports you need for each period, historically, so you don't need to do these where clauses, you simply view the data for that period.

So it doesn't matter. Store it however, and implement a historical database that will be more efficient for long-term reporting.

I'd go with the date stored directly against the entry.

Noon Silk
A: 

If you can use smalldatetime it is the same size as integer - both 4 bytes. And under the hood the datetime datatypes are integers.

The first 2 bytes of smalldatetime are something like the number of days elapsed since maybe 1/1/1900 and the second 2 bytes are something like the number of seconds elapsed since midnight. (This might not be exact but you get the point.) So these datatypes are very efficient.

I think a where clause performed against the smalldatetime field will be fine.

+5  A: 

I deal with warehouses in the millions of rows on a daily basis, and we find that smart date keys are the way to go. This is in the format of YYYYMMDD. So to find all of 2008, you'd do:

select
    *
from
    gl
where
    postdate between 20080101 and 20081231

With an indexed column this is phenomenally fast, even across one billion rows. This is also pointing to a date table, so we can tack on day of week, month names, or whatever else information about dates we have with that join.

Of course, these warehouses are usually built to support SSAS cubes (OLAP databases), and so that date table becomes our date dimension. It's much faster to join on an int than a datetime.

Eric
hrm yeah, now that I can take what you wrote and research it, it does seem this is a pretty standard solution, particularily in cubes.
Eric
+1  A: 

Also consider what is in effect the date portion of an Actual datetime or smalldatetime field... The 4-byte integer representing the number of days since 1 jan 1900.

This can be cast to an actual datetime implicitly, very fast, (since it is the exact same value as the first four bytes of an 8-byte DateTime value)

you can also use it in Where clauses against actual datetime values, since the SQL Server engine implicitly converts one to the other and back again.

Plus, every possile value of a 32-bit (4-byte) integer is a valid datetime (Midnight) for the internal SQL Server Datetime datatype

Charles Bretana