Some co-workers and I got into a debate on the best way to store historical data. Currently, for some systems, I use a separate table to store historical data, and I keep an original table for the current, active record. So, let's say I have table FOO. Under my system, all active records will go in FOO, and all historical records will go in FOO_Hist. Many different fields in FOO can be updated by the user, so I want to keep an accurate account of everything updated. FOO_Hist holds the exact same fields as FOO with the exception of an auto-incrementing HIST_ID. Every time FOO is updated, I perform an insert statement into FOO_Hist similar to: insert into FOO_HIST select * from FOO where id = @id
.
My co-worker says that this is bad design because I shouldn't have an exact copy of a table for historical reasons and should just insert another record into the active table with a flag indicating that it's for historical purposes.
Is there a standard for dealing with historical data storage? It seems to me that I don't want to clutter my active records with all of my historical records in the same table considering that it may be well over a million records (I'm thinking long term).
How do you or your company handle this?
I'm using MS SQL Server 2008, but I'd like to keep the answer generic and arbitrary of any DBMS.
Thanks in advance for any input.