There are two master databases and two read-only copies updated by standard transactional replication. I need to map an entity from both read-only databases; let's say that database A contains orders and database B contains lines.

Replication

The problem is that replication to one database can lag behind replication to the other, so at the moment of mapping the read-only (R) databases can hold inconsistent data. For example:

We stored two orders with lines, at 19:00 and 19:03. The mapping process started at 19:05, but by that moment replication of database A had processed all changes up to 19:03, while replication of database B had processed changes only up to 19:00. After mapping we get an order entity with the order as of 19:03 and the lines as of 19:00. Trouble is guaranteed :)

In my particular case both databases use a temporal model, so it is possible to fetch data for any time slice; the problem is identifying the time of the latest replication.

Question: How can I synchronize the replication processes of several databases to avoid the situation described above? Or, in other words, how can I compare the time of the last replication in each database?
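To make the time-slice idea concrete, here is a minimal Python sketch, assuming each temporal row version carries a validity interval. The names `Version`, `valid_from`, `valid_to` and `rows_as_of` are hypothetical and only illustrate the idea; the real schema is not shown here.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class Version(NamedTuple):
    key: int
    payload: str
    valid_from: datetime
    valid_to: Optional[datetime]  # None means the version is still current


def rows_as_of(versions, as_of):
    """Return the row versions that were visible at the instant `as_of`."""
    return [
        v for v in versions
        if v.valid_from <= as_of and (v.valid_to is None or as_of < v.valid_to)
    ]
```

With something like this, orders and lines can both be read for the same `as_of` instant, provided that instant is one both replicas have already reached.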

UPD:

The only way I see to synchronize is to continuously write timestamps into service tables in each database and to check these timestamps on the replicated servers. Is that an acceptable solution?
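A rough sketch of how the reading side could use such timestamps, in Python. It assumes that each replica applies changes in commit order and that the clocks of the masters are comparable; the function name and the dictionary layout are hypothetical. Data is only known to be complete on every replica up to the oldest heartbeat, so that is the time slice to query.

```python
from datetime import datetime


def common_time_slice(heartbeats):
    """Latest instant up to which every replica is known to be complete.

    `heartbeats` maps a replica name to the newest heartbeat timestamp
    visible on that replica (hypothetical layout).
    """
    if not heartbeats:
        raise ValueError("need at least one replica heartbeat")
    return min(heartbeats.values())


# The example from above: A has caught up to 19:03, B only to 19:00.
# The calendar date is arbitrary.
heartbeats = {
    "A": datetime(2010, 1, 1, 19, 3),
    "B": datetime(2010, 1, 1, 19, 0),
}
safe_as_of = common_time_slice(heartbeats)  # the slower replica's time
```

The mapping process would then fetch both orders and lines as of `safe_as_of` instead of "now".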

+1  A: 

You can't. Your only solution is to have only one master. You should never split an application into different databases, as you already have the problem even without replication. You cannot restore both A and B to a consistent state. The database is your unit of recovery, and it should never be split into separate entities.

Your best option is to go back to the drawing board and redesign your application so that it keeps state in only one database, like all applications should. If you can't accomplish this, then you're going to have to give up consistency on the replicas.

Remus Rusanu
I know that there are lots of problems, but unfortunately I can't redesign the system: it has been live for about 10 years and about 1000 developers are building applications around it.
Yauheni Sivukha
One alternative is logical replication (of operations) instead of physical replication (of tables). E.g. on the master a transaction writes an invoice header in A and invoice details in B. Instead of replicating the two inserts, you replicate the operation of writing an invoice with the given header and details, as one single, atomic operation. Replication can do this if the operation is done through a stored proc: it can replicate the *invocation* as opposed to its effect. Another solution could be to ship an SSB message. But it is quite hard to implement and maintain.
Remus Rusanu
Again, you are talking about things I cannot affect. I already have tons of applications writing to the master, and replicated databases from which I have to read. The only thing I can do is inject some kind of synchronization mechanism. As for the SSB message, how can it help? It is still possible that the data is in the master and the SSB message has been sent, but the data is unavailable in the replica.
Yauheni Sivukha
+1  A: 

It seems that the task can't be solved within the given constraints. If I understood correctly, the number of databases and the row schema are fixed.

So the variables that are left:

  • Additional "injections" into the database
  • Temporal tricks
  • Trigger tricks
  • "Late binding" of changes that were not replicated in time

So far I have found only one idea that seems to work:

  1. Add a trigger on the "Lines" table that updates a time-stamp on the "Order" record (last_line_time)
  2. On the replica, wait until a Line with a time equal to last_line_time appears.
    • If max(lines.line_time) > order.last_line_time, then the order is obsolete
    • If max(lines.line_time) < order.last_line_time, then the lines are obsolete
    • If max(lines.line_time) == order.last_line_time, then everything is OK, for now :)

But this can turn into an infinite loop if Lines are constantly modified and the Lines table replica always lags behind.
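The check in step 2 could be sketched like this in Python, with a bounded number of attempts so that it gives up instead of looping forever. This is only an illustration: `read_pair` is a hypothetical callable that returns (order.last_line_time, max(lines.line_time)) as read from the replicas, and the retry count and delay are arbitrary.

```python
import time


def classify(order_last_line_time, max_line_time):
    """Compare the order's stamp with the newest replicated line."""
    if max_line_time is None or max_line_time < order_last_line_time:
        return "lines_obsolete"   # lines replica lags behind
    if max_line_time > order_last_line_time:
        return "order_obsolete"   # order replica lags behind
    return "consistent"


def wait_until_consistent(read_pair, attempts=5, delay_seconds=0.0):
    """Poll `read_pair()` a bounded number of times.

    Returns True once the pair is consistent, False if attempts run out.
    """
    for _ in range(attempts):
        if classify(*read_pair()) == "consistent":
            return True
        time.sleep(delay_seconds)
    return False
```

When `wait_until_consistent` returns False the caller has to decide what to do, e.g. skip the order for now or report it as stale.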

Valera Kolupaev
This approach requires significant development on both the database and the application side. However, it solves the problem, thanks.
Yauheni Sivukha
A: 

Why don't you create views in, say, Database C that join the tables with the appropriate states from Database A and Database B, so that C has synchronized data, and then replicate them? I think this way you would have consistent data.

Baaju
Good point. But the source databases are replicated to avoid huge load while fetching big amounts of data. In your case db A and db B are still under load. (I would like to highlight that this is not my decision, but this is what I have to put up with)
Yauheni Sivukha
Maybe you could go for "indexed" views, which may reduce the load on db A and db B somewhat.
Baaju