views:

1148

answers:

7

First, we currently have the behavior that's desired, but it's not trivial to maintain when any changes to the database are needed. I'm looking for anything simpler, more efficient, or easier to maintain (anything that does any of those 3 would be most welcome). When we perform an update, a history row is created that is a copy of the current row, and the current row's values are then updated. The result being that we have a history record of how the row was before it was updated.

Reasoning: We have to be compliant with a number of federal rules, and went this route to have a full audit history of everything, as well as we can look at the database at any point in time and see how things looked (future requirement). For similar reasons, I cannot change how history is recorded...any solution must result in the same data as the current triggers create.

Here's what the current triggers look like for the Contact Table:
(stripped useless fields for brevity, the number of fields doesn't matter)

Before update (each row):

DECLARE
     indexnb number;
BEGIN
  :new.date_modified := '31-DEC-9999';
  indexnb := STATE_PKG.newCONTACTRows.count + 1;
  :new.date_start := sysdate;
  :new.version := :old.version + 1;
  state_pkg.newCONTACTRows(indexnb).ID := :old.ID;
  state_pkg.newCONTACTRows(indexnb).PREFIX := :old.PREFIX;
  state_pkg.newCONTACTRows(indexnb).FIRST_NAME := :old.FIRST_NAME;
  state_pkg.newCONTACTRows(indexnb).MIDDLE_NAME := :old.MIDDLE_NAME;
  state_pkg.newCONTACTRows(indexnb).LAST_NAME := :old.LAST_NAME;
  --Audit columns after this
  state_pkg.newCONTACTRows(indexnb).OWNER := :old.OWNER;
  state_pkg.newCONTACTRows(indexnb).LAST_USER := :old.LAST_USER;
  state_pkg.newCONTACTRows(indexnb).DATE_CREATED := :old.DATE_CREATED;
  state_pkg.newCONTACTRows(indexnb).DATE_MODIFIED := sysdate;
  state_pkg.newCONTACTRows(indexnb).VERSION := :old.VERSION;
  state_pkg.newCONTACTRows(indexnb).ENTITY_ID := :old.id;
  state_pkg.newCONTACTRows(indexnb).RECORD_STATUS := :old.RECORD_STATUS;
  state_pkg.newCONTACTRows(indexnb).DATE_START := :old.DATE_START;
END;

Before update (once for all rows):

BEGIN
  state_pkg.newCONTACTRows := state_pkg.eCONTACTRows;
END;

After update (once for all rows):

DECLARE
BEGIN
  for i in 1 .. STATE_PKG.newCONTACTRows.COUNT loop
    INSERT INTO "CONTACT" (
      ID, 
      PREFIX, 
      FIRST_NAME, 
      MIDDLE_NAME, 
      LAST_NAME, 
      OWNER, 
      LAST_USER, 
      DATE_CREATED, 
      DATE_MODIFIED, 
      VERSION, 
      ENTITY_ID, 
      RECORD_STATUS, 
      DATE_START)
    VALUES (
      CONTACT_SEQ.NEXTVAL, 
      state_pkg.newCONTACTRows(i).PREFIX,
      state_pkg.newCONTACTRows(i).FIRST_NAME,
      state_pkg.newCONTACTRows(i).MIDDLE_NAME,
      state_pkg.newCONTACTRows(i).LAST_NAME,
      state_pkg.newCONTACTRows(i).OWNER,
      state_pkg.newCONTACTRows(i).LAST_USER,
      state_pkg.newCONTACTRows(i).DATE_CREATED,
      state_pkg.newCONTACTRows(i).DATE_MODIFIED,
      state_pkg.newCONTACTRows(i).VERSION,
      state_pkg.newCONTACTRows(i).ENTITY_ID,
      state_pkg.newCONTACTRows(i).RECORD_STATUS,
      state_pkg.newCONTACTRows(i).DATE_START
    );
  end loop;
END;

The package defined as (trimmed, full version is just copy of this per table):

PACKAGE STATE_PKG IS
  TYPE CONTACTArray IS TABLE OF CONTACT%ROWTYPE INDEX BY BINARY_INTEGER; 
  newCONTACTRows CONTACTArray; 
  eCONTACTRows CONTACTArray;
END;

The current result

Here's a resulting history sample:

ID    First Last   Ver  Entity_ID  Date_Start              Date_Modified  
1196  John  Smith  5    0          12/11/2009 10:20:11 PM  12/31/9999 12:00:00 AM
1201  John  Smith  0    1196       12/11/2009 09:35:20 PM  12/11/2009 10:16:49 PM
1203  John  Smith  1    1196       12/11/2009 10:16:49 PM  12/11/2009 10:17:07 PM
1205  John  Smith  2    1196       12/11/2009 10:17:07 PM  12/11/2009 10:17:19 PM
1207  John  Smith  3    1196       12/11/2009 10:17:19 PM  12/11/2009 10:20:00 PM
1209  John  Smith  4    1196       12/11/2009 10:20:00 PM  12/11/2009 10:20:11 PM

Each history record has an Entity_ID that's the ID of the current row, the Date_Start on the new record matches the Date_Modified of the last history row. This allows us to do queries like Where Entity_ID = :id Or ID = :id And :myDate < Date_Modified And :myDate >= Date_Start. History can be fetched by Entity_ID = :current_id.

Is there a better approach, hopefully more maintainable/flexible to do this? The concept is simple, when updating a row, copy it to the same table via an insert with the old values, then update the current row...but actually doing that, I have yet to find a simpler way. I'm hoping someone much trickier/wiser in Oracle has a better approach to this. Speed doesn't matter much, we're 99% reads 1% writes like most web applications, and all bulk operations are inserts, not updates which wouldn't create any history.

If anyone has any ideas to simplify the maintenance on this, I'd be extremely appreciative, thanks!

+2  A: 

Unfortunately there is no way to avoid referencing all the column names (:OLD.this, :OLD.that, etc.) in triggers. However, what you could do is write a program to generate the trigger code from the table definition (in USER_TAB_COLS). Then whenever the table is changed you can generate and compile a fresh copy of the triggers.

See this AskTom thread for how to do that.

Tony Andrews
@Tony - This is what we currently do, running that and syncing up multiple copies of the database is a real nightmare...hoping someone knows an alternative, something like `select * into temp`, update only the few audit fields, do an insert....I'm not sure how to work this out though, Oracle's not my primary field :(
Nick Craver
+3  A: 

Okay, this is a rewrite. What I missed when I first responded is that the application is storing its History in the main table. Now I understand why @NickCraver is so apologetic about the code.

Well the first thing to do is to hunt down the perpetrators of this design and make sure they never do it again. Storing history like this doesn't scale, makes normal (non-historical) queries more complicated and sabotages relational integrity. Obviously there are scenarios where none of that matters, and perhaps your site is one of them, but in general this is a very bad implementation.

The best way of doing this is Oracle 11g Total Recall. It's an elegant solution, with a completely invisible and effcient implementation, and - by the standards of Oracle's other chargeable extras - quite reasonably priced.

But if Total Recall is out of the question and you really must do it this, don't allow updates. A change to an existing CONTACT record should be an insert. In order to make this work you may need to build a view with an INSTEAD OF trigger. It's still yucky but not quite as yucky as what you have now.

APC
You're saying do this with combining just `:new.date_modified := '31-DEC-9999'; :new.date_start := sysdate; :new.version := :old.version + 1;` in a single trigger? This may work...there was a reason we couldn't do single triggers before but it's one of the situations it'd so long ago I can't remember what the reason was, let me try your approach and see what happens. Thanks!
Nick Craver
I was thinking that, but isn't there a danger of "table is mutating" with a multi-row insert? Or is that OK these days - I can't remember.
Tony Andrews
@APC - I tried this out...Tony is correct, this produces a `ORA-04091: table CONTACT is mutating, trigger/function may not see it`, this is why we couldn't go this route before...took seeing it to remember, like most ancient things.
Nick Craver
11g isn't out of the question, but we're on 10 at the moment and can't upgrade without fair notice. For the scaling, it's a matter of least complication in all layers. We often reference the history in Linq-to-Oracle, and `DB.Contacts.Current()` vs `DB.Contacts` makes it easy to access history in a generic way up top...that's the reasoning behind it. The partitioning based on Entity_Id (=0, !=0) keeps performance good in our case....but, the system's not huge yet. It's possible a better solution could be reached by changing the database history and linq layers, but no time currently.
Nick Craver
@APc - Your original solution with some modifications worked, there's no reason we can't use anonymous transactions for this purpose, I hadn't come across that work-around before, thanks for making me take a harder look at the simpler approach!
Nick Craver
A: 

If you want to develop a generic solution, you might want to take a look at DBMS_SQL package. With it you could develop a package/procedure that takes a table name as input and builds the updates based on that, by examining the table structure in the dictionary and building the updates on the fly. It would be non trivial up-front development, but a lot less maintenance in the future, since if a table structure changes, the code would sense that and adapt. This method would work for any table that you care to use it with.

DCookie
A: 

Depending on the complexity of your database (number of tables, size, depth of PK/FK relationships, other logic in triggers), you might want to look at Oracle Workspace Management. You make an API call to put a table under workspace management which results in Oracle replacing the table with an updatable view and other corresponding objects that maintain a history of all versions of the rows.

I've used this and while there are disadvantages, one advantage for auditing is that the code objects are all generated by Oracle and their correctness is generally assumed.

dpbradley
I looked at the workspace mgmt tool. It's really sweet but I was told we need a separate license for that and they don't want to pay for it.
FALCONSEYE
I'm not sure about the lower end Oracle products, but WM is part of the Enterprise Edition.
dpbradley
A: 

The only time I might recommend that historical records be stored in the same table as the "current" records is when FK links to the records must or might need to link to them. For example, one application I've seen had some FK links that would link to the record as of a "point in time", that is, if the record was updated, the FK would still link to the historical record - this was an important part of the design and separating history records into a second table would have made it more unwieldy.

Apart from that, I'd prefer that a business requirement for tracking all changes should be solved using a separate "history" table for each table. Sure, it means more DDL, but it simplifies the application code enormously and you'll also benefit from better performance and scalability.

Jeffrey Kemp
Actually, in our case a separate history table complicates the application code much more...we access it via Linq and accessing history or current is very quick/easy (handled in a handful of generic operators/extension methods). There's very little code filtering...but the reverse of fetching history from another source would be tremendous **in our specific case**. I agree with you in general of course, we happen to have a very specialized use/situation that it fits better...but of course I'm always open to alternatives if the **overall** solution is simpler, easier, or more efficient.
Nick Craver
+1  A: 

In case someone has the same highly specialized case we do (Linq access making single table history much cleaner/easier, this is what I ended up doing to simplify what we have, welcome any improvements....this is just a script that will run whenever the database changes, regenerating the audit triggers, the main change being PRAGMA AUTONOMOUS_TRANSACTION; placing the history generating on an autonomous transaction and not caring about mutation (which doesn't matter for how we audit):

Declare
  cur_trig varchar(4000);
  has_ver number;
Begin
    For seq in (Select table_name, sequence_name 
              From user_tables ut, user_sequences us
              Where sequence_name = replace(table_name, '_','') || '_SEQ'
                And table_name Not Like '%$%'
                And Exists (Select 1
                            From User_Tab_Columns utc
                            Where Column_Name = 'ID' And ut.table_name = utc.table_name)
                And Exists (Select 1
                            From User_Tab_Columns utc
                            Where Column_Name = 'DATE_START' And ut.table_name = utc.table_name)
                And Exists (Select 1
                            From User_Tab_Columns utc
                            Where Column_Name = 'DATE_MODIFIED' And ut.table_name = utc.table_name))
    Loop
     --ID Insert Triggers (Autonumber for oracle!)
     cur_trig := 'CREATE OR REPLACE TRIGGER ' || seq.table_name || 'CR' || chr(10)
              || 'BEFORE INSERT ON ' || seq.table_name || chr(10)
              || 'FOR EACH ROW' || chr(10)
              || 'BEGIN' || chr(10)
              || '  SELECT ' || seq.sequence_name || '.NEXTVAL INTO :new.ID FROM DUAL;' || chr(10)
              || '  IF(:NEW.ENTITY_ID = 0) THEN' || chr(10)
              || '    SELECT sysdate, sysdate, ''31-DEC-9999'' INTO :NEW.DATE_CREATED, :NEW.DATE_START, :NEW.DATE_MODIFIED FROM DUAL;' || chr(10)
              || '  END IF;' || chr(10)
              || 'END;' || chr(10);

     Execute Immediate cur_trig;

     --History on update Triggers
     cur_trig := 'CREATE OR REPLACE TRIGGER ' || seq.table_name || '_HIST' || chr(10)
              || '  BEFORE UPDATE ON ' || seq.table_name || ' FOR EACH ROW' || chr(10)
              || 'DECLARE' || chr(10)
              || '  PRAGMA AUTONOMOUS_TRANSACTION;' || chr(10)
              || 'BEGIN' || chr(10)
              || '  INSERT INTO ' || seq.table_name || ' (' || chr(10)
              || '   DATE_MODIFIED ' || chr(10)
              || '   ,ENTITY_ID ' || chr(10);

       For col in (Select column_name
                 From user_tab_columns ut
                 Where table_name = seq.table_name
                   And column_name NOT In ('ID','DATE_MODIFIED','ENTITY_ID')
                 Order By column_name)
     Loop
       cur_trig := cur_trig || '   ,' || col.column_name || chr(10);
     End Loop;

     cur_trig := cur_trig || ') VALUES ( --ID is Automatic via another trigger' || chr(10)
                          || '   SYSDATE --DateModified Set' || chr(10)
                          || '   ,:old.ID --EntityID Set' || chr(10);

     has_ver := 0;
       For col in (Select column_name
                 From user_tab_columns ut
                 Where table_name = seq.table_name
                   And column_name NOT In ('ID','DATE_MODIFIED','ENTITY_ID')
                 Order By column_name)
     Loop
       cur_trig := cur_trig || '   ,:old.' || col.column_name || chr(10);
       If Upper(col.column_name) = 'VERSION' Then 
         has_ver := 1; 
       End If;
     End Loop;

     cur_trig := cur_trig || ');' || chr(10)
                          || ':new.DATE_MODIFIED := ''31-DEC-9999'';' || chr(10)
                          || ':new.DATE_START := SYSDATE;' || chr(10);
     If has_ver = 1 Then
       cur_trig := cur_trig || ':new.version := :old.version + 1;' || chr(10);
     End If;
     cur_trig := cur_trig || 'COMMIT;' || chr(10)
                          || 'END;' || chr(10);

     Execute Immediate cur_trig;
    End Loop;
End;
/

If you can improve, feel free...I've only written a handful of PL/SQL scripts, the need doesn't arise often...probably a lot left to be desired there.

Answer credit to APC for getting me to look at this a bit harder. I don't recommend this history layout unless it its the rest of your model/application/stack extremely well. For this application, we constantly show a mix of history and current, and filtering is far simpler than combining when it comes to a Linq-to-SQL style access. Thanks for all the answers guys, all good suggestions...and when I have more time and am not crunched by a release schedule, this is something I'll revisit to see if it can be improved further.

Nick Craver
A: 

I understand your specifc application requirements to have the history and current values in the same table, but perhaps this could be handled by going down the more usual route of having a separate audit table but building it up as a pseudo-materialized view to present a combined view for the application.

For me, this has the advantage of having a simple "current" view and a separate but completely automated "audit" view (which in this case also has the current view).

Something like:

create sequence seq_contact start with 1000 increment by 1 nocache nocycle;

create table contact (
    contact_id integer,
    first_name varchar2(120 char),
    last_name varchar2(120 char),
    last_update_date date
    );

alter table contact add constraint pk_contact primary key (contact_id);

create table a$contact (
    version_id integer,
    contact_id integer,
    first_name varchar2(120 char),
    last_name varchar2(120 char),
    last_update_date date
    );

alter table a$contact add constraint pk_a$contact primary key
        (contact_id, version_id);

create or replace trigger trg_contact
before insert or delete or update on contact 
for each row
declare

    v_row contact%rowtype;
    v_audit a$contact%rowtype;

begin

    select seq_contact.nextval into v_audit.version_id from dual;

    if not deleting then

        :new.last_update_date := sysdate;

    end if;

    if inserting or updating then

        v_audit.contact_id := :new.contact_id;
        v_audit.first_name := :new.first_name;
        v_audit.last_name := :new.last_name;
        v_audit.last_update_date := :new.last_update_date;

    elsif deleting then

        v_audit.contact_id := :old.contact_id;
        v_audit.first_name := :old.first_name;
        v_audit.last_name := :old.last_name;
        v_audit.last_update_date := sysdate;

    end if;

    insert into a$contact values v_audit;

end trg_contact;
/

insert into contact (contact_id, first_name, last_name) values
    (1,'Nick','Pierpoint');

insert into contact (contact_id, first_name, last_name) values
    (2, 'John', 'Coltrane');

insert into contact (contact_id, first_name, last_name) values
    (3, 'Sonny', 'Rollins');

insert into contact (contact_id, first_name, last_name) values
    (4, 'Kenny', 'Wheeler');

update contact set last_name = 'Cage' where contact_id = 1;

delete from contact where contact_id = 1;

update contact set first_name = 'Zowie' where contact_id in  (2,3);

select * from a$contact order by contact_id, version_id;

VERSION_ID  CONTACT_ID  FIRST_NAME  LAST_NAME  LAST_UPDATE_DATE
1000        1           Nick        Pierpoint  11/02/2010 14:53:49
1004        1           Nick        Cage       11/02/2010 14:54:00
1005        1           Nick        Cage       11/02/2010 14:54:06
1001        2           John        Coltrane   11/02/2010 14:53:50
1006        2           Zowie       Coltrane   11/02/2010 14:54:42
1002        3           Sonny       Rollins    11/02/2010 14:53:51
1007        3           Zowie       Rollins    11/02/2010 14:54:42
1003        4           Kenny       Wheeler    11/02/2010 14:53:53
Nick Pierpoint