views:

46

answers:

0

We're using a "1 audit table for each monitored Table" design; However, in our case emp(PARENT) table has a child table emp_address which also needs to be monitored, so we have emp_audit and emp_addr_audit tables.

postgres audit SQL : how to join PARENT and CHILD tables for reporting purposes.

/* Employee table */    
create table emp (
 emp_id integer primary key,
 empnum  integer,
 empname varchar(50),
);

/* Address table */    
create table emp_addr (
 addr_id integer primary key,
 emp_id integer, -- references table emp
 line1 varchar(30),
);

/* Audit table for emp table */    
create table emp_audit (
 operation   character(1),
 emp_id integer,
 empnum  integer,
 empname varchar(50),
 updatetime timestamp,
 txid bigint
);

/* Audit table for emp_addr table */    
create table emp_addr_audit (
 operation   character(1),
 addr_id integer,
 emp_id integer,
 line1 varchar(30),
 updatetime timestamp,
 txid bigint
);

We're using hibernate(java) for persistence and hibernate updates only those tables whose columns were modified in the update operation. Given this, I might have multiple(say, 5) rows in the emp_addr_audit table for 1 row in emp_audit table. And vice-versa as well.

The report needs 1 row for each transaction(modification). The report will have the following columns

empname, line1, operation(insert/delete/update), updatetime

Let's consider 2 scenarios to understand what's needed:

  1. In the initial transaction only emp attributes are created. Then in a separate transaction, the corresponding row in emp_addr is created. So, now, we have 1 row in emp_audit table and 1 row in emp_addr_audit table. The report will have 2 rows (one each for each transaction).
  2. Both emp and emp_addr attributes are created in a single transaction. This will ensure that there is 1 row in emp_audit and 1 row in emp_addr_audit. Now, the report will have ONLY 1 row (since both table rows were created in a single transaction).

Scenario :
Transaction #1 : I insert a row into both emp and emp_addr. This results in a row each in emp_audit and emp_addr_audit.(INSERT)
Transaction #2 : I update the above emp' attribute. This results in a UPDATE row in emp_audit.
Transaction #3 : I update the above emp_addr's attribute. This results in a UPDATE row in emp_addr_audit.

I tried the following SQL #1 and it returned 3 rows (as expected);

SQL #1

SELECT emp.*, addr.*
 FROM  emp_audit emp 
 FULL OUTER JOIN emp_addr_audit addr USING(emp_id, txid);

However, when I added a where clause to the SQL, it returns only 2 rows. The missing row was the result of Transaction #3, where only emp_addr table row was UPDATED and emp table row was untouched.
SQL #2

SELECT emp.*, addr.*
 FROM  emp_audit emp 
        FULL OUTER JOIN emp_addr_audit addr USING(emp_id, txid);
WHERE  emp.empnum = 20;

What SQL will STILL be able to get me 3 rows for the 3 transactions so that I can still filter out based on empnum ?