Synchronizing one or more databases with a master database - Foreign keys

+2 Q:

I'm using Google Gears to be able to use an application offline (I know Gears is deprecated). The problem I am facing is the synchronization with the database on the server.

The specific problem is the primary keys or, more exactly, the foreign keys. When sending the information to the server, I could easily ignore the primary keys and generate new ones. But then how would I know what the relations are?

I had one solution in mind, but then I would need to save all the primary keys for every client. What is the best way to synchronize multiple clients with one server db?

Edit:
I've been thinking about it, and I guess sequential primary keys are not the best solution, but what other possibilities are there? Time-based keys don't seem right because of the collisions that could happen.

A GUID comes to mind; is that an option? It looks like generating a GUID in JavaScript is not that easy.
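
For reference, a version 4 (random) GUID is only 16 random bytes with a few fixed bits, so it can be built by hand in most languages. The sketch below is in Python rather than JavaScript, and the function name random_guid is made up for illustration; it assumes nothing beyond a source of random bytes.

```python
import os
import uuid


def random_guid():
    """Build a version 4 GUID string from 16 random bytes (hypothetical helper)."""
    raw = bytearray(os.urandom(16))
    raw[6] = (raw[6] & 0x0F) | 0x40  # high nibble of byte 6 marks version 4
    raw[8] = (raw[8] & 0x3F) | 0x80  # top two bits of byte 8 mark the RFC 4122 variant
    h = raw.hex()
    # Standard 8-4-4-4-12 grouping of the 32 hex digits.
    return f"{h[0:8]}-{h[8:12]}-{h[12:16]}-{h[16:20]}-{h[20:32]}"


# The standard library parser accepts the result as a version 4 UUID.
parsed = uuid.UUID(random_guid())
```

Uniqueness here is probabilistic rather than guaranteed, which is usually acceptable for 122 random bits but depends on the quality of the random source on the client.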

I can do something with natural keys or composite keys. As I'm thinking about it, that looks like the best solution. Can I expect any problems with that?

I've come up with the following solution:

Every client gets a unique id from the server. Everywhere a primary key is referenced, I use a composite key with the client id and an auto increment field.

This way, the combination is unique, and it's easy to implement. The only thing left is making sure every client does get a unique id.

I just found out one drawback: SQLite doesn't support autoincrement on composite primary keys, so I would have to handle the IDs myself.
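
A minimal sketch of handling the IDs by hand, assuming Python's sqlite3 module stands in for the Gears database. The table orders, its columns, and the helper insert_order are hypothetical names, and the client id is assumed to have been handed out by the server beforehand.

```python
import sqlite3


def create_schema(conn):
    # Composite primary key: (client id from the server, counter kept by the client).
    conn.execute(
        """
        CREATE TABLE orders (
            client_id INTEGER NOT NULL,
            local_id  INTEGER NOT NULL,
            label     TEXT,
            PRIMARY KEY (client_id, local_id)
        )
        """
    )


def insert_order(conn, client_id, label):
    """Insert a row, picking the next local_id for this client by hand."""
    with conn:  # one transaction: read the current maximum, then insert
        row = conn.execute(
            "SELECT COALESCE(MAX(local_id), 0) + 1 FROM orders WHERE client_id = ?",
            (client_id,),
        ).fetchone()
        local_id = row[0]
        conn.execute(
            "INSERT INTO orders (client_id, local_id, label) VALUES (?, ?, ?)",
            (client_id, local_id, label),
        )
    return client_id, local_id
```

One caveat of MAX + 1: if the highest row is deleted, its number gets reused. Keeping the counter in a separate one-row table avoids that, at the cost of one more table to maintain.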

Ikke 2010-03-06 13:39:32

+2 A:

This is not quite a full answer, but might at least provide you with some ideas...

The question you're asking (and the problem you're trying to address) is not specific to Google Gears, and will remain valid with other solutions, like HTML 5 or systems based on Flash/AIR.

A presentation on that subject was given at the last ZendCon a few months ago -- and the slides are available on SlideShare: Planning for Synchronization with Browser-Local Databases

Going through those slides, you'll see notes about a couple of possibilities that might come to mind (some did actually come to your mind, or appear in other answers):

Using GUID
Composite Keys
Primary key pool (i.e. reserve a range of keys beforehand)
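
As an illustration of the third option, here is a rough sketch of a key pool, not taken from the slides. The classes KeyPoolServer and KeyPoolClient and the block size are assumptions; in a real system reserve_block would be a request to the server made while the client is online.

```python
class KeyPoolServer:
    """Hands out non-overlapping key ranges (hypothetical server side)."""

    def __init__(self, block_size=1000, start=1):
        self.block_size = block_size
        self._next_start = start

    def reserve_block(self):
        first = self._next_start
        self._next_start += self.block_size
        return first, first + self.block_size - 1  # inclusive range


class KeyPoolClient:
    """Consumes keys from the block reserved while the client was online."""

    def __init__(self, server):
        self._server = server
        self._next = None
        self._last = None

    def refill(self):
        # Must be called while connected; offline inserts then use next_key().
        self._next, self._last = self._server.reserve_block()

    def next_key(self):
        if self._next is None or self._next > self._last:
            raise RuntimeError("key pool exhausted; reconnect to reserve a new block")
        key = self._next
        self._next += 1
        return key
```

The keys stay plain integers, so foreign keys need no translation on sync. The trade-off is that a client that runs out of keys while offline cannot create new rows until it reconnects.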

Of course, for each one of those, there are advantages... and drawbacks -- I will not copy-paste them : take a look at the slides ;-)

Now, in your situation, which solution will be the best? Hard to say, actually -- and the sooner you think about synchronisation, the better/easier it'll probably be: adding stuff into an application is so much simpler when that application is still in its design stage ^^

First, it might be interesting to determine whether :

Your application is generally connected, and is only rarely disconnected
Or if your application is generally disconnected, and only connects once in a while.

Then, what are you going to synchronise ?

Data ?
- Like "This is the list of all commands made by that user"
- With that data replicated on each disconnected device, of course -- and each of them can modify it
- In this case, if one user deletes a line, and another one adds a line, how to know which one has the "true" data ?
Or actions made on those data ?
- Like "I am adding an entry in the list of commands made by that user"
- In this case, if one user deletes a line, and another one adds a line, it's easy to synchronize, as you just have to synchronise those two actions to your central DB
- But this is not that easy to implement, especially for a big application / system: each time an action is made, you have to log it in some way!
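
To make the second approach more concrete, here is a small sketch of logging actions on the client and replaying them on the master copy. Everything in it (ActionLog, replay, the "add" and "delete" action names, a dict as the master store) is a simplification chosen for illustration, and it assumes the keys are already globally unique.

```python
class ActionLog:
    """Client-side journal of changes made while disconnected (hypothetical)."""

    def __init__(self):
        self._entries = []

    def record(self, action, key, value=None):
        self._entries.append({"action": action, "key": key, "value": value})

    def drain(self):
        """Return the pending entries and clear the journal."""
        entries, self._entries = self._entries, []
        return entries


def replay(master, entries):
    """Apply logged actions to the master copy, in the order they were logged."""
    for entry in entries:
        if entry["action"] == "add":
            master[entry["key"]] = entry["value"]
        elif entry["action"] == "delete":
            master.pop(entry["key"], None)  # deleting a missing row is a no-op
        else:
            raise ValueError(f"unknown action: {entry['action']!r}")
    return master


# One client deletes a line, another adds one; both actions reach the master.
master = {"line-1": "old entry"}
client_a, client_b = ActionLog(), ActionLog()
client_a.record("delete", "line-1")
client_b.record("add", "line-2", "new entry")
replay(master, client_a.drain())
replay(master, client_b.drain())
```

Two clients updating the same row still need a conflict rule of some kind; the sketch only covers the add/delete case described above.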

There is also a specific problem we don't generally think about until it happens: especially if your synchronisation process can take some time (if you have a lot of data, if you don't synchronise often, ...), what if the synchronisation is interrupted before it has finished?

For instance, what if :

A user, in a train, has access to the network, with some 3G card
The synchronisation starts
There is a tunnel -- and the connection is lost.

Having half-synchronised data might not be that good, in most situations...

So, you have to find a solution to that problem, too : in most cases, the synchronisation has to be atomic !
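
One common way to get that all-or-nothing behaviour on the receiving side is to apply each batch inside a single database transaction. The sketch below assumes Python's sqlite3 module on the receiving end; the table orders and the function apply_batch are hypothetical, and the lost connection is modelled as a ConnectionError raised while rows are still arriving.

```python
import sqlite3


def create_schema(conn):
    conn.execute(
        """
        CREATE TABLE orders (
            client_id INTEGER NOT NULL,
            local_id  INTEGER NOT NULL,
            label     TEXT,
            PRIMARY KEY (client_id, local_id)
        )
        """
    )


def apply_batch(conn, rows):
    """Insert every row of the batch, or none of them."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for row in rows:
                conn.execute(
                    "INSERT INTO orders (client_id, local_id, label) VALUES (?, ?, ?)",
                    row,
                )
    except (ConnectionError, sqlite3.Error):
        return False  # nothing from this batch was kept; the client can retry
    return True
```

The client should only mark its local rows as synchronised after apply_batch reports success, so that an interrupted batch is simply sent again later.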

Pascal MARTIN 2010-03-12 22:45:40

Too bad I couldn't come online to accept an answer in time. The main use-case is that clients are online most of the time, but can be disconnected sometimes. But thanks for this post, it gives a lot of things to think about.

Ikke 2010-03-13 14:24:55

You're welcome :-) Have fun ! -- mostly connected users can make things a bit easier, in some cases : it means less data to synchronise ^^

Pascal MARTIN 2010-03-13 14:26:34

I would use a similar setup to your latest answer. However, to get around your auto-increment issue, I would use a single auto-increment surrogate key in your master database and then store the client primary key and your client id as well. That way you are not losing or changing any data in the process and you are also tracking which client the data was originally sourced from.

Be sure to also set up a unique index on your Client Pk, Client Id to enable referential integrity from any child tables.
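
A sketch of what that master-side layout could look like, using SQLite through Python's sqlite3 module as a stand-in for the real server database. The table and column names (orders, order_lines, client_pk, order_client_pk) and the two import helpers are assumptions made for the example.

```python
import sqlite3


def create_master_schema(conn):
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(
        """
        CREATE TABLE orders (
            id        INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key on the master
            client_id INTEGER NOT NULL,                   -- which client sent the row
            client_pk INTEGER NOT NULL,                   -- the key the client used
            label     TEXT,
            UNIQUE (client_id, client_pk)
        );
        CREATE TABLE order_lines (
            id              INTEGER PRIMARY KEY AUTOINCREMENT,
            client_id       INTEGER NOT NULL,
            client_pk       INTEGER NOT NULL,
            order_client_pk INTEGER NOT NULL,             -- parent key as the client knows it
            item            TEXT,
            UNIQUE (client_id, client_pk),
            FOREIGN KEY (client_id, order_client_pk)
                REFERENCES orders (client_id, client_pk)
        );
        """
    )


def import_order(conn, client_id, client_pk, label):
    cur = conn.execute(
        "INSERT INTO orders (client_id, client_pk, label) VALUES (?, ?, ?)",
        (client_id, client_pk, label),
    )
    return cur.lastrowid  # the master's own surrogate key


def import_order_line(conn, client_id, client_pk, order_client_pk, item):
    cur = conn.execute(
        "INSERT INTO order_lines (client_id, client_pk, order_client_pk, item)"
        " VALUES (?, ?, ?, ?)",
        (client_id, client_pk, order_client_pk, item),
    )
    return cur.lastrowid
```

Because the child table references the unique (client_id, client_pk) pair, the client's foreign key values can be stored unchanged, and the database rejects child rows whose parent was never uploaded.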

Leigh Shayler 2010-03-13 04:06:39

Is there a reasonable limit to how many objects the client can create while disconnected? One possibility I can see is to create a sort of "local sequence".

When your client connects to the central server, it gets a numeric ID, say a 6-digit number (the server generates it as a sequence).

The actual PKs are created as strings like this: 895051|000094 or 895051|005694, where the first part is the 6-digit number sent from the server, and the second part is a "local" sequence managed by the client.

As soon as you sync with the central server, you can get a new 6-digit number and restart your local sequence. This is not too different from what you were proposing, all in all. It just makes the actual PK completely independent of the client identity.
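
A small sketch of such a key generator, with the field width taken from the sample keys (six digits on each side of the separator). The class name LocalSequence and its methods are invented for the example, and fetching the server number is left out.

```python
class LocalSequence:
    """Builds keys like 895051|000094 from a server number and a local counter."""

    WIDTH = 6  # digits on each side, matching the sample keys (an assumption)

    def __init__(self, server_number=0):
        # 0 is the placeholder for a client that has never reached the server.
        self.server_number = server_number
        self._counter = 0

    def next_key(self):
        if self._counter >= 10 ** self.WIDTH - 1:
            raise OverflowError("local sequence exhausted; sync to get a new number")
        self._counter += 1
        return f"{self.server_number:0{self.WIDTH}d}|{self._counter:0{self.WIDTH}d}"

    def renew(self, server_number):
        """After a sync: take the new server number and restart the local part."""
        self.server_number = server_number
        self._counter = 0


seq = LocalSequence(895051)
first = seq.next_key()   # "895051|000001"
seq.renew(895052)
second = seq.next_key()  # "895052|000001"
```

The fixed width is what bounds the number of objects a client can create between two syncs, which is why the limit asked about above matters.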

Another bonus is that if you have a scenario where the client has never connected to the server, it can use 000000|000094 locally, request a new number from the server and update the keys on its side before sending them back to the server for sync (this is tricky if you have lots of FK constraints though, and might not be feasible).

p.marino 2010-03-13 12:29:19
