In today’s world where a lot of computers, mobile devices or web services share data or act like hubs, syncing gets more important. As we all know solutions that sync aren’t the most comfortable ones and it’s best not to sync at all.
I’m still curious how you would implement a syncing solution to sync between multiple entities. There are already a lot of different approaches, like comparing a changed date field or a hash and using the most recent data or letting the user chose what he wants to use in a case of a conflict. Another approach is to try to automatically merge conflicted data (which in my opinion isn’t so clever, because a machine can’t guess what the user meant).
Anyway, here are a couple of questions related to sync that we should answer before starting to implement syncing:
- What is the most recent data? How do I want to represent it?
- What do I do in case of a conflict? Merge? Do I prompt and ask the user what to do?
- What do I do when I get into an inconsistent state (e.g. a disconnect due to a flakey mobile network connection)?
- What do I have to do when I don’t want to get into an inconsistent state?
- How do I resume a current sync that got interrupted?
- How do I handle data storage (e.g. MySQL database on a web service, Core Data on an iPhone; and how do I merge/sync the data without a lot of glue code)?
- How should I handle edits from the user that happen during the sync (which runs in the background, so the UI isn’t blocked)?
- How and in which direction do I propagate changes (e.g. a user creates a „Foo“ entry on his computer and doesn’t sync; then he’s on the go and creates another „Foo“ entry; what happens when he tries to sync both devices)? Will the user have two „Foo“ entries with different unique IDs? Will the user have only one entry, but which one?
- How should I handle sync when I have hierarchical data? Top-down? Bottom-up? Do I treat every entry atomically or do I only look at a supernode? How big is the trade-off between oversimplifying things and investing too much time into the implementation?
- …
There are a lot of other questions and I hope that I could inspire you enough. Syncing is a fairly general problem. Once a good, versatile syncing approach is found, it should be easier to apply it to a concrete application, rather than start thinking from scratch. I realize that there are already a lot of applications that try to solve (or successfully solve) syncing, but they are already fairly specific and don’t give enough answers to syncing approaches in general.