views:

215

answers:

5

DB gurus,

I am hoping someone can set set me on the right direction.

I have two tables. Table A and Table B. When the system comes up, all entries from Table A are massaged and copied over to Table B (according to Table B's schema). Table A can have tens of thousands of rows.

While the system is up, Table B is kept in sync with Table A via DB change notifications.

If the system is rebooted, or my service restarted, I want to re-initialize Table B. However, I want to do this with the least possible DB updates. Specifically, I want to:

  • add any rows that are in Table A, but not in Table B, and
  • delete any rows that are not in Table A, but are in Table B
  • any rows that are common to Table A and Table B should be left untouched

Now, I am not a "DB guy", so I am wondering what is conventional way of doing this.

A: 

A simple way would be to use a historic table where you would put the changes from A that happened since the last update, and use that table to sync the table B instead of a direct copy from A to B. Once the sync is done, you delete the whole historic table and start anew.

What I don't understand is how table A can be update and not B if your service or computer is not running. Are they found on 2 different database or server?

David Brunelle
If the service maintaining Table B goes down, then it won't process any Change notifications for updates to Table A. So, when the service comes back up it needs to re-sync.
prmatta
+1  A: 

Use exists to keep processing to a minimum.

Something along these lines, modified so the joins are correct (also verify that I didn't do something stupid and get TableA and TableB backwards from your description):

insert into TableB
    select 
     *
    from
     TableA a
    where
     not exists (select 1 from TableB b where b.ID = a.ID)

delete from 
    TableB b
where
    not exists (select 1 from TableA a where a.ID = b.ID)
Donnie
I like this except, this does not handle cases where existing rows have changed between Table A and Table B. Any suggestions for that?
prmatta
Here is how I did the update:http://stackoverflow.com/questions/1829311/update-via-subquery-what-if-the-subquery-returns-no-rows
prmatta
A: 

Join data from both tables according to comon columns and this gives you the rows that have a match in both tables, i.e. data in A and in B. Then use this values (lets call this set M) with set operations, i.e. set minus operations to get the differences.

first requirement: A minus M second requrement: B minus A third requirement: M

Do you get the idea?

lewap
+1  A: 

Informix's Enterprise Replication features would do all this for you. ER works by shipping the logical logs from one server to another, and rolling them forward on the secondary.

You can configure it to be as finely-grained as you need (ie just a handful of tables).

You use the term "DB change notifications" - are you already using ER or is this some trigger-based arrangement?

If for some reason ER can't work for your configuration, I would suggest rewriting the notifications model to behave asynchronously, ie:

  • write notifications to a table in server 'A' that contains a timestamp or serial field
  • create a table on server 'B' that stores the timestamp/serial value of the last processed record
  • run a daemon process on server 'B' that:
    • compares 'A' and 'B' timestamps/serials
    • selects 'A' records between 'A' and 'B' timestamps
    • processes those records into 'B'
    • update 'B' timestamp/serial
    • sleep for appropriate time-period, and loop

So Server 'B' is responsible for ensuring its copy is in sync with 'A'. 'A' is not inconvenienced by 'B' being unavailable.

RET
ER seems like the mechanism of choice; it is automatic and hence reliable, and also has the minimum overhead. It does assume that the tables are in a logged database and that the tables have a primary key (as, of course, well designed table should have).
Jonathan Leffler
You could also look at the Change Data Capture (CDC) mechanism in IDS 11.50. Note that it helps if you tell us the version of IDS.
Jonathan Leffler
A: 

Hi, I am a Sql Server guy but since Sql Server 2008, for this kind of operation , a feature call MERGE is available.

By using MERGE statement we can perform insert, update and delete operations in a single statement.

So I googled and found that Informix also supports the same MERGE statement but I am not sure whether it takes care of delete too or not though insert and update is being taken care off. Moreover, this statement takes care of transaction by itself

priyanka.sarkar