views:

88

answers:

3

I have a users table that has the following fields: userid, phone, and address. Since this is user data, I'm letting the user change them whenever he wants. Problem is I'd like to keep track of those changes and preserve the old data too. Here's some of the ideas I considered:

  • appending the new data to the old data and using a separator like a pipe. When retrieving the field, I would check for the existence of that separator and if exists, get the chars after it as the new data. (feels cumbersome and doesn't feel right)

  • setting up a different changes table with the following fields: userid, fieldname, fieldcontent. When/if a user changes data (any data), I would log the event in this separate table under the user's userid, and the name/id of the field and the old content of the field, then I can now overwrite his old data in users with the new. If I want to find all changes made by this user, I would search the changes table by his userid. Problem with this is that I'm mixing all data changes (of all fields) into one table and so the fieldcontent field in changes has to be text to accommodate the varying field types. This still seems better than the first idea, but still not sure if I'm doing the right thing.

What other ideas are there or known best practices to keep old data?

Thanks in advance

+3  A: 

Whatever you do don't do the first one.

The changes table is a better approach. It's also called an audit or history table. I wouldn't do a history of key-value pairs however. Instead do a history per relevant table. You can do this in application code or via database triggers. Basically whenever an insert, update or delete happens you record which happened and what data was changed.

Table user:

  • id
  • username
  • email address
  • phone
  • address

Table user_history:

  • id
  • change_type (I, U or D for insert, update or delete)
  • user_id (FK user.id)
  • email address
  • phone
  • address
  • date/time of change
  • optionally, also store who changed the record
cletus
Ok, this info is good enough for now, will take it and run with it from here.
Chris
+1  A: 

A very simple way that we have used to track such changes is this:

users_history` 
    userid 
    changenumber smallint not null
    changedate datetime not null
    changeaddr varchar(32) not null
    phone NULL,
    address NULL

    primary key on (userid, linenumber)

Each time you INSERT or UPDATE a record in the users table, simply INSERT a new record in the users_history table. changenumber starts at 1 and increments from there. changedate and changeaddr could be used to track when and where.

If a field value has not changed, feel free to put NULL in the respective users_history table field.

At the end of the day, your app does not need to change or store bulky history data in the users table, but you have all if it at your fingertips.

Edit:

This does preserve the old data. See the following example where the user started with a given address and phone, and then 4 days later updated the address, and 5 days later updated the phone. You have everything.

Current users record:

100                            |  234-567-8901   |   123 Sesame Street


Sample History Table

100   |  1  | 2009-10-01 12:00 |  123-456-7890   |   555 Johnson Street
100   |  2  | 2009-10-05 13:00 |  NULL           |   123 Sesame Street
100   |  3  | 2009-10-10 15:00 |  234-567-8901   |   NULL
gahooa
thanks, +1 too for the response.
Chris
but after re-reading it, I don't think it accomplishes what I want, since I really do want to store the old data.
Chris
Hi Chris. Please see the edit, as it does store the old data, new data, and all the in-between data.
gahooa
A: 

The simplest way to implement this will be have another table just for history purpose, a snapshot. You don't need to mirror all the fields, just

change_id // row id (just for easy management later on if you need to delete specific row, otherwise its not really necessary)
user_id // Original user id
change_time // time of change
data // serialized data before change.
Darkerstar
If are doing this, you might also want to put this on a different server, since you don't really need to correlate any data from this snapshots table with the user apart from the user_id which you simply pass. If you have many users, this could grew very quickly.
Darkerstar