Recently I’ve found myself in a database tangle where management wants the ability to remove data from the database, but still wants that data to appear in other places. Example: They want to remove all instances of the product whizbang, but they still want whizbang to appear in sales reports. (if they ran one for a previous date).
Now I can add a field, say is_deleted, that will track whether that product has been deleted and thus still keep all my references, but over a period of time, I have the potential of housing a lot of dead data. (data that is never accessed again). How to handle this is not my question.
I’m curious to find out, in your experience what is the average life span of data? That is, on average how long is data alive or good for before it gets either replaced or deleted? I understand that this is relative to the type of data you are housing, but certainly all data has some sort of life span?