For a project I am working on, I have been asked to create an audit trail of all changes that have been made to records. This is the first time I have had to create an audit trail, so I have been doing a lot of research on the subject.

The application will be developed in PHP/MSSQL, and will be low-traffic.

From my reading, I have pretty much decided to have an audit table and use triggers to record the changes in the table.

The two requirements for display in the application are as follows:

  1. Be able to see a log of all changes made to a field (I pretty much know how to do this)

  2. Be able to see, when viewing a record in the application, an indicator next to any field in the record that has ever been changed (and possibly other info like the date of the last change).

Item #2 is the one that is currently giving me grief. Without doing a separate query for each field (or a very long nested query that will take ages to execute), does anyone have suggestions for an optimal way to do this? (I have thought of adding an extra "ModifiedFlag" field for each field in the table, acting as a boolean indicator of whether the field has ever been edited, but that seems like a lot of overhead.)

+3  A: 

I would treat the audit information separately from the actual domain information as much as possible.

Requirement #1: I think you will create additional audit tables to record the changes. Eric's suggestion is a good one: creating the audit information using triggers in the SQL database. This way your application need not be aware of the audit logic.

If your database does not support triggers, then perhaps you are using some kind of persistence or database layer. This would also be a good place to put this kind of logic, as again you minimize any dependencies between normal application code and the audit code.
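As a rough illustration of the trigger approach, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for MSSQL (trigger syntax differs between the two, so treat the SQL as illustrative only). The `goals` and `goals_audit` tables and their columns are hypothetical names made up for this example.

```python
import sqlite3

# Assumption: SQLite stands in for MSSQL; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE goals (id INTEGER PRIMARY KEY, title TEXT, status TEXT);
CREATE TABLE goals_audit (
    goal_id INTEGER, field TEXT, old_value TEXT, new_value TEXT,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);
-- One trigger per audited column; it records a row only when the value really changes.
CREATE TRIGGER goals_title_audit AFTER UPDATE OF title ON goals
WHEN OLD.title IS NOT NEW.title
BEGIN
    INSERT INTO goals_audit (goal_id, field, old_value, new_value)
    VALUES (OLD.id, 'title', OLD.title, NEW.title);
END;
""")

# The application only touches the main table; the trigger does the auditing.
conn.execute("INSERT INTO goals (id, title, status) VALUES (1, 'Ship v1', 'open')")
conn.execute("UPDATE goals SET title = 'Ship v2' WHERE id = 1")   # audited
conn.execute("UPDATE goals SET status = 'done' WHERE id = 1")     # not audited here

audit_rows = conn.execute(
    "SELECT goal_id, field, old_value, new_value FROM goals_audit"
).fetchall()
```

In this sketch only the `title` column is audited, so status updates (which the goal owners make routinely) leave no audit row.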

Requirement #2: As for showing the indicators: I would not create boolean fields in the table that stores the actual data. (This would cause all sorts of dependencies to exist between your normal application code and your audit trail code.)

I would try to let the code responsible for displaying the form also be responsible for showing audit data on field level. This will cause query overhead, but that is the cost for displaying this extra layer of information. Perhaps you can minimize the database overhead by adding metadata to the audit information that allows for easy retrieval.

Some big Enterprisy application that I maintain uses roughly the following structure:

  • A change header table corresponding to a change of a record in a table.

Fields:

changeId, changeTable, changedPrimaryKey, userName, dateTime

  • A change field table corresponding to a field that is changed.

Fields:

changeId, changeField, oldValue, newValue

Sample content:

Change Header:

'1', 'BooksTable', '1852860138', 'AdamsD', '2009-07-01 15:30'

Change Item:

'1', 'Title', 'The Hitchhiker's Guide to the Gaxaly', 'The Hitchhiker's Guide to the Galaxy'
'1', 'Author', 'Duglas Adasm', 'Douglas Adams'

This structure allows both easy viewing of audit trails and easy retrieval for showing the desired indicators. One query (an inner join on the header and item tables) would be enough to retrieve all the information to show in a single form. (Or even in a table, when you have a list of the IDs shown.)
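The single-query retrieval could look roughly like the sketch below, again using Python's `sqlite3` as a stand-in for MSSQL. The column names come from the structure above; the table names `change_header` and `change_item` and the helper `changed_fields` are assumptions made for this example. Dates are stored as sortable text so that `MAX` picks the latest one.

```python
import sqlite3

# Assumption: SQLite stands in for MSSQL; table names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE change_header (
    changeId INTEGER PRIMARY KEY, changeTable TEXT, changedPrimaryKey TEXT,
    userName TEXT, dateTime TEXT
);
CREATE TABLE change_item (
    changeId INTEGER, changeField TEXT, oldValue TEXT, newValue TEXT
);
""")
conn.execute(
    "INSERT INTO change_header VALUES (1, 'BooksTable', '1852860138', 'AdamsD', '2009-07-01 15:30')"
)
conn.executemany("INSERT INTO change_item VALUES (?, ?, ?, ?)", [
    (1, "Title", "The Hitchhiker's Guide to the Gaxaly", "The Hitchhiker's Guide to the Galaxy"),
    (1, "Author", "Duglas Adasm", "Douglas Adams"),
])

def changed_fields(conn, table, primary_key):
    """Return {field name: date of last change} for one record, using one join."""
    rows = conn.execute("""
        SELECT i.changeField, MAX(h.dateTime)
        FROM change_header AS h
        INNER JOIN change_item AS i ON i.changeId = h.changeId
        WHERE h.changeTable = ? AND h.changedPrimaryKey = ?
        GROUP BY i.changeField
    """, (table, primary_key)).fetchall()
    return dict(rows)

flags = changed_fields(conn, "BooksTable", "1852860138")
```

The form code can then show an indicator next to every field that appears as a key in `flags`, with the date of the last change available as the value.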

dwergkees
For 'easy retrieval' of which fields are changed, perhaps I could create a trigger which stored the initial record on creation both in the main table and in a separate 'originals' table? Then for display I could just retrieve the record from both tables ('main' and 'originals') and compare them in code to flag the appropriate fields?
Kirsehn
That's possible, but then you would have two tables for each thing you want to store, and you could only see which fields have changed since the record was created. The solution mentioned in my answer allows you to create a detailed audit trail which can show what fields have changed over time and what values those fields had. In other words, you build up a history. Also, you do not have to create two tables for everything, just your 'main' tables and your audit trail header and item tables.
dwergkees
Makes sense. I probably didn't make myself clear in my first comment. I meant that in addition to the method you suggested above (which I agree with), I would also create a separate table to store the original record. This would allow, I think, much easier retrieval of a simple list of fields that were changed (for marking fields as changed) than trying to retrieve it from the audit trail. I would still use the trail for displaying details of all changes.
Kirsehn
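The compare-in-code idea from the comment above might be sketched as follows in Python (the application itself would be PHP). The function name and the sample field names are hypothetical; each record is assumed to arrive as a dictionary of field name to value, one from the 'originals' table and one from the 'main' table.

```python
def changed_since_creation(original, current):
    """Return the names of fields whose current value differs from the original."""
    return sorted(field for field in original if original[field] != current.get(field))

# Hypothetical goal record, as stored at creation and as it is now.
original = {"title": "Reduce churn", "target": "5%", "status": "open"}
current = {"title": "Reduce churn", "target": "4%", "status": "in progress"}

flagged = changed_since_creation(original, current)
```

Note that this only detects a difference from the original: a field that was changed and later changed back would not be flagged, whereas the audit trail would still show both edits.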
OK that makes sense from both a performance view and ease of development!
dwergkees
+1  A: 

As a general requirement, flagging changed fields "smells" slightly odd. If records are long-lived and subject to change over time, then eventually all fields will tend to get so flagged. Hence I wonder how any user could make sense of a simple set of indicators per field.

That line of thinking makes me suspect that the data you store needs to be, as you've described, a true audit trail with all the changes recorded, and the first real challenge is to decide how the info should be presented to the user.

I think your idea of preparing some kind of aggregateOfTheAuditTrail data is likely to be very useful. The question would be: is a single flag per record enough? If the user's primary access is through a list, then maybe it's enough just to highlight the changed records for later drill-down. Or show a date of last change of the record value, so that only recently changed records are highlighted. It all comes back to what the user's real needs are. I find it hard to imagine that records changed 3 years ago are as interesting as those changed last week.

Then when we come to the drill down to a single record. Again a simple flag per field doesn't sound useful (though your domain, your requirements). If it is, then your summary idea is fine. My guess is that a sequence of changes to a field, and the sequence of overall changes to the record, are much more interesting. Employee had pay rise, employee moved department, employee was promoted = three separate business events or one?

If anything more than a simple flag is needed then I suspect that you just need to return the whole (or recent) audit trail for the record and let the UI figure out how to present that.

So, my initial thought: some kind of rolling maintenance of a summary record sounds like a good idea, maintained in background threads or batch jobs if necessary. We design that to be business-useful without going to the full audit trail each time. Then for detailed analyses we allow some or all of the trail to be retrieved.
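One possible shape for such a summary, sketched in Python under the assumption that the audit trail can be read as (record id, field name, change time) tuples. The function names, the sample data, and the seven-day window are all made up for illustration.

```python
from datetime import datetime, timedelta

def summarize(trail):
    """Fold an audit trail into one summary entry per record."""
    summary = {}
    for record_id, field, changed_at in trail:
        entry = summary.setdefault(
            record_id, {"last_change": changed_at, "fields": set()}
        )
        entry["fields"].add(field)
        if changed_at > entry["last_change"]:
            entry["last_change"] = changed_at
    return summary

def recently_changed(summary, now, window=timedelta(days=7)):
    """Return ids of records whose last change falls within the window."""
    return sorted(
        rid for rid, entry in summary.items() if now - entry["last_change"] <= window
    )

# Hypothetical trail: record 1 changed recently, record 2 three years earlier.
trail = [
    (1, "salary", datetime(2009, 6, 1)),
    (1, "department", datetime(2009, 6, 28)),
    (2, "title", datetime(2006, 7, 1)),
]
summary = summarize(trail)
recent = recently_changed(summary, now=datetime(2009, 7, 1))
```

A batch job could rebuild or incrementally update `summary` and store it alongside the records, so that list views never need to touch the full trail.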

djna
Some more explanation of the application is probably in order. The records, although they will likely be archived, are only 'active' for a year. It is an application to store goals for the year, which get entered at the beginning of the year after they are approved by management. The goal owners can then go in and modify the status of any goal, but if they make any changes to the initial goal data, it needs to be 'flagged' somehow. Basically, it is a simplified project management system ;)
Kirsehn
A: 

Personally, I'd make the tracking simple, and the reporting funky.

Each time a user inserts a record, you make an insert into the audit table for that table:

'I', 'Date', 'User', 'Data column1','Data Column2', etc.

That is assuming the structure of the tables won't change over time (i.e. the number of data columns).

For updates, just insert

'U', 'Date', 'User', 'Data column1', etc

Insert what the user just entered as an update.

Then, after the insert and update, you will have the following

'I','May 3 2009','BLT','person005','John','Smith','Marketing'
'U','May 4 2009','BLT','person005','John','Smith','Accounting'

Then, it's just an easy report to show that the unique person record 'person005' has had an insert and an update, where their department was updated.
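A report of that kind could be produced along the lines of the following Python sketch. It assumes the audit rows are read in chronological order and that the data columns are named as in `COLUMNS`; those names and the function `report_changes` are hypothetical.

```python
# Hypothetical names for the data columns that follow action, date and user.
COLUMNS = ("id", "first_name", "last_name", "department")

audit_rows = [
    ("I", "May 3 2009", "BLT", "person005", "John", "Smith", "Marketing"),
    ("U", "May 4 2009", "BLT", "person005", "John", "Smith", "Accounting"),
]

def report_changes(rows):
    """Compare each update snapshot with the previous snapshot of the same record."""
    previous = {}   # record id -> last seen data columns
    report = []
    for action, date, user, *data in rows:
        record_id = data[0]
        if action == "U" and record_id in previous:
            for name, old, new in zip(COLUMNS, previous[record_id], data):
                if old != new:
                    report.append((record_id, name, old, new, date, user))
        previous[record_id] = data
    return report

changes = report_changes(audit_rows)
```

Each entry in `changes` names the record, the column that changed, the old and new values, and who made the change and when, which is enough to drive both a change log and per-field indicators.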

Due to the low usage of the system, having a simple insert on each change and then a more complex reporting process isn't going to affect performance. This style will still work with higher-traffic systems, as the extra load on an edit is minimal, whereas the more intensive work of reporting back the changes isn't done as often as an update, so the system won't fall over.

glasnt