views: 218
answers: 2

Greetings,

Just a bit of background information: I am currently working on a relatively large desktop-based requirements management utility, and I have come across an issue for which I believe I know the solution, but which I am hesitant to implement without further thought.

Basically, at any time during the use of this requirements management tool, someone can take a baseline (or label, for people more familiar with that term) of a document or directory of documents. Doing so requires that I maintain a copy of the entire active document or folder of documents, including any settings associated with each doc, traceability links, and the actual content (which is limited to formatted text), so that users can come back later and look at it.

I had originally considered an SVN-style approach of tracking edits in a table and, upon a baseline, creating a link to the latest copy of the data that the baseline covered at that time... however, the complexity of that just overwhelmed me.

My second consideration was to simply make a copy of the table data associated with the active document or folder of documents that the baseline was applied to, and store that for safekeeping. The code complexity savings would be enormous and the schema would be kept simpler, though there is some impact in terms of storage usage (I do not believe it will be that big).

Anyway - Am I sane in thinking this way? I'm trying to develop this application with only a DBMS as my server-side application and a moderate number of clients that interact with it.

If I am sane, then is there any optimal route for storing duplicate data in the same database (creating a new database for what is simply a label is not what I am looking for)? A set of copied tables, perhaps, though for some reason that just seems clumsy?
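For what it's worth, one alternative to copied tables is a single snapshot table keyed by a baseline ID, so every baseline shares one schema. A minimal sketch of that idea (all table and column names here are hypothetical, using SQLite for illustration):

```python
import sqlite3

# Sketch of the "copy the data" approach without duplicating tables:
# live rows stay in `document`, and each baseline freezes a copy of
# every row into `document_snapshot`, keyed by a baseline_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document (
        doc_id  INTEGER PRIMARY KEY,
        content TEXT
    );
    CREATE TABLE baseline (
        baseline_id INTEGER PRIMARY KEY AUTOINCREMENT,
        label       TEXT,
        created_at  TEXT DEFAULT CURRENT_TIMESTAMP
    );
    -- One row per (baseline, document): a frozen copy of the content.
    CREATE TABLE document_snapshot (
        baseline_id INTEGER REFERENCES baseline(baseline_id),
        doc_id      INTEGER,
        content     TEXT,
        PRIMARY KEY (baseline_id, doc_id)
    );
""")

def take_baseline(conn, label):
    """Copy every live document row into the snapshot table."""
    cur = conn.execute("INSERT INTO baseline (label) VALUES (?)", (label,))
    bid = cur.lastrowid
    conn.execute("""
        INSERT INTO document_snapshot (baseline_id, doc_id, content)
        SELECT ?, doc_id, content FROM document
    """, (bid,))
    return bid

conn.execute("INSERT INTO document VALUES (1, 'v1 text')")
bid = take_baseline(conn, "Release 1.0")
conn.execute("UPDATE document SET content = 'v2 text' WHERE doc_id = 1")

# The baseline still shows the content as it was at snapshot time.
old = conn.execute(
    "SELECT content FROM document_snapshot WHERE baseline_id = ? AND doc_id = 1",
    (bid,)).fetchone()[0]
```

The storage cost is the same as copied tables, but the schema stays fixed no matter how many baselines users take.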

+2  A: 

This is something I'm faced with as well on a current project. Some good answers here.

I'm personally using option #1, adding a version column to the pkey of each "versioned" table. FKeys use the version column as well. My main consideration was ease of querying, with a consequence of high storage requirements. I have the leeway to strictly limit the number of versions kept around, which mitigates that issue.
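A rough sketch of what that versioned-pkey layout can look like, with hypothetical table names (SQLite for illustration); note that the foreign keys carry the version column too, so links resolve within a single version:

```python
import sqlite3

# Version column as part of the primary key: each edit inserts a new
# (req_id, version) row rather than updating in place. Names are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE requirement (
        req_id  INTEGER,
        version INTEGER,
        text    TEXT,
        PRIMARY KEY (req_id, version)
    );
    -- Trace links reference a specific version on both ends.
    CREATE TABLE trace_link (
        from_req INTEGER,
        from_ver INTEGER,
        to_req   INTEGER,
        to_ver   INTEGER,
        FOREIGN KEY (from_req, from_ver) REFERENCES requirement(req_id, version),
        FOREIGN KEY (to_req, to_ver)     REFERENCES requirement(req_id, version)
    );
""")
conn.execute("INSERT INTO requirement VALUES (1, 1, 'original wording')")
conn.execute("INSERT INTO requirement VALUES (1, 2, 'revised wording')")

# Querying the latest version of each requirement is a simple MAX().
latest = conn.execute("""
    SELECT text FROM requirement r
    WHERE version = (SELECT MAX(version) FROM requirement
                     WHERE req_id = r.req_id)
""").fetchone()[0]
```

Pruning old versions to cap storage is then a single DELETE filtered on the version column.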

Rich Rodriguez
+1  A: 

I really think you are looking for a temporal database. Our own implementation added a pair of audit log identifiers to a table, the "create ID" and "delete ID" (and every transaction that modified the database was summarized in an "audit log" and given a unique sequential identifier). For each transaction that added a record to the table, the audit log identifier was used as the create ID and the delete ID was zero. When a record changed, the new version had the new create ID and the same identifier was used as the delete ID for the old version.

It was quite easy to create a view of a table where the delete ID was zero, creating a virtual table of just current data. You could also find the content of the tables just before or just after a specific transaction was processed, get a list of the transactions that modified a specific row of a table, etc.
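The scheme described above can be sketched roughly as follows (hypothetical names, SQLite for illustration); rows are never updated in place, only superseded, and "current data" is just the rows with a zero delete ID:

```python
import sqlite3

# Create-ID/delete-ID temporal pattern: every transaction gets a
# sequential audit-log ID; a change closes the old row version and
# inserts a new one. All names here are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE audit_log (
        txn_id      INTEGER PRIMARY KEY AUTOINCREMENT,
        description TEXT
    );
    CREATE TABLE account (
        acct_id   INTEGER,
        balance   INTEGER,
        create_id INTEGER NOT NULL,            -- txn that created this version
        delete_id INTEGER NOT NULL DEFAULT 0   -- txn that superseded it (0 = live)
    );
    -- Current data is simply the rows nobody has superseded yet.
    CREATE VIEW account_current AS
        SELECT acct_id, balance FROM account WHERE delete_id = 0;
""")

def txn(conn, desc):
    """Record a transaction in the audit log and return its ID."""
    return conn.execute(
        "INSERT INTO audit_log (description) VALUES (?)", (desc,)).lastrowid

t1 = txn(conn, "open account")
conn.execute("INSERT INTO account (acct_id, balance, create_id) VALUES (1, 100, ?)", (t1,))

t2 = txn(conn, "deposit 50")
conn.execute("UPDATE account SET delete_id = ? WHERE acct_id = 1 AND delete_id = 0", (t2,))
conn.execute("INSERT INTO account (acct_id, balance, create_id) VALUES (1, 150, ?)", (t2,))

def as_of(conn, t):
    """State just after transaction t: created by then, not yet superseded."""
    return conn.execute("""
        SELECT balance FROM account
        WHERE acct_id = 1 AND create_id <= ?
          AND (delete_id = 0 OR delete_id > ?)
    """, (t, t)).fetchone()[0]
```

Finding the transactions that touched a given row is then a join from `account` to `audit_log` on either identifier.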

There are time, space, and complexity penalties for all this. But for us, dealing with large sums of money, it was worth it.

Joel