I'm looking to implement data synchronization between servers and distributed clients. The data source on the server is MySQL with Django on top. The client can vary. Updates can take place on either the client or the server, and the connection between server and client is not reliable (e.g. changes can be made on a disconnected cell phone and should get synced when the phone has a connection again).

S. Lott suggests using a version control design pattern in this question, which makes sense. I'm wondering if there are any existing packages / implementations of this I can use. Or, should I directly make use of svn/git/etc?

Are there other alternatives? There must be synchronization frameworks or detailed descriptions of algorithms out there, but I'm not having a lot of luck finding them. I'd appreciate if you point me in the right direction.

A: 

You may find this link useful: http://pdis.hiit.fi/pdis/download/

It is the Personal Distributed Information Store (PDIS) project download page, and lists out some relevant python packages.

tgray
A: 

Perhaps using plain old rsync is enough.

martin
A: 

AFAIK there isn't any generic solution to this, mainly due to the diverse requirements for synchronization.

In one of our earlier projects, we implemented a Spring Batch based sync mechanism which relies on a last-updated timestamp field on each of the tables that take part in the sync.
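The timestamp idea boils down to: each client remembers when it last synced, and only pulls rows modified after that point. A minimal sketch in Python (the record layout and field names here are illustrative, not from any particular framework):

```python
from datetime import datetime, timezone

# Hypothetical server-side rows; each carries a last_updated timestamp
# maintained on every write.
server_rows = [
    {"id": 1, "name": "alpha", "last_updated": datetime(2009, 5, 1, tzinfo=timezone.utc)},
    {"id": 2, "name": "beta",  "last_updated": datetime(2009, 5, 3, tzinfo=timezone.utc)},
]

def changes_since(rows, last_sync):
    """Return only the rows modified after the client's last successful sync."""
    return [r for r in rows if r["last_updated"] > last_sync]

# A client that last synced on May 2nd only pulls the row changed on May 3rd.
delta = changes_since(server_rows, datetime(2009, 5, 2, tzinfo=timezone.utc))
```

Note that this scheme detects what changed but says nothing about deletes (you need tombstone rows or a deletion log) or about clock skew between replicas.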

I have heard about SyncML but don't have much experience with it.

If you have a single server and multiple clients, you could consider a JMS-based approach. The data is bundled and placed in queues (or topics) and pulled by clients when they connect.
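The queue-per-client idea is independent of JMS itself. A minimal in-memory sketch in Python (class and method names are my own, purely for illustration): each registered client gets its own queue, every update fans out to all queues topic-style, and a client drains its queue when it regains connectivity.

```python
from collections import defaultdict, deque

class ChangeBroker:
    """Fan out server-side changes to one queue per client (topic-style)."""

    def __init__(self):
        self.queues = defaultdict(deque)  # client_id -> pending changes
        self.clients = set()

    def register(self, client_id):
        self.clients.add(client_id)

    def publish(self, change):
        # Deliver the change to every registered client's queue.
        for cid in self.clients:
            self.queues[cid].append(change)

    def pull(self, client_id):
        # Called when the client reconnects: drain and return its backlog.
        q = self.queues[client_id]
        batch = list(q)
        q.clear()
        return batch
```

A real deployment would persist the queues (a disconnected phone may be offline for days), but the contract is the same: publish never blocks on a client being reachable.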

In your case, since updates are bi-directional, you need to handle conflict detection as well, which adds considerable complexity.
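One common way to detect such conflicts is to version each record and compare both replicas against the version the client last saw from the server. A sketch, assuming a per-record `version` counter bumped on every write and an `updated_at` timestamp (both field names are assumptions for this example); when both sides diverged from the common ancestor, it falls back to last-write-wins, though surfacing the conflict to the user is often the safer choice:

```python
def resolve(server_rec, client_rec, base_version):
    """Return (winning record, conflict flag).

    base_version is the record version the client held after its last
    sync; any replica whose version exceeds it has been written since.
    """
    server_changed = server_rec["version"] > base_version
    client_changed = client_rec["version"] > base_version

    if server_changed and client_changed:
        # True conflict: both replicas diverged from the common ancestor.
        # Here: last-write-wins on the timestamp (a lossy policy!).
        winner = max(server_rec, client_rec, key=lambda r: r["updated_at"])
        return winner, True

    # Only one side (or neither) changed: take the newer replica.
    return (client_rec if client_changed else server_rec), False
```

Timestamp-based tie-breaking assumes reasonably synchronized clocks; version vectors avoid that assumption at the cost of more bookkeeping.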

Sathya