tags:

views:

67

answers:

3

I have a List of Java objects on my server which is sent to the client through some serialization mechanism. Once in a while the List of objects gets updated on the server, that is, some objects get added, some get deleted and others just change their place in the List. I want to update the List on the client side as well, but send the least possible data. Especially, I don't want to resend Objects which are already available on the client.

Is there a library available which will produce some sort of diff from the two lists, so that I can only send the difference and the new Objects accross the wire?

I have found several Java implementation of the unix diff command, but this algorithm is unpractical for order changes. ie. [A,B,C] -> [C,B,A] could be sent as only place changes [1->3] [3->1], while diff will want to resend the whole A and C objects (as far as I understand).

+3  A: 

I would do this by making the public interface of the objects wherever they are modified silently keep a log of changes made, that is, add an object representing each modification to a list of modifications.

That way you have a minimal list of the exact changes to send to the other machine, rather than needing to infer them using fallible guesswork by comparing old versus new.

To create the object model so that it automatically records changes to itself, you will likely benefit from some code generation or AOP to avoid a lot of repetitive patterns. Methods that set the value of a property, or add/remove from lists, all need to call into a central log shared by the object hierarchy.

Daniel Earwicker
Good approach, but this solution is not minimal without a reduction step. For ex. if you add and later remove an element, the element should not be sent over the wire. (BTW sorry for the late response)
Philipp
+3  A: 

You can "pretend" that your list is a string, and use Damerau–Levenshtein distance to find the minimum operations necessary to transform one to another, allowing insertion, deletion, substitution, and transposition (which is what your example suggests).

I'm not aware of a mature and/or stable implementation, and even if one exists, it's likely targeted for strings, so adapting to a list of abstract value types would be a challenge. Implementing your own is also likely to be a challenging task, but it's certainly doable.

polygenelubricants
Thanks for the reference. The implementation of the D-L algo which I found on the web do not seem reliable and writing my own is a bit overkill for my problem.
Philipp
A: 

For now I'll just send the complete List over the wire but instead of the objects, I use only a unique ID. If the client does not have the object locally, it requests it using the ID.

This is certainly less beautiful than an optimal algorithm but has the expected result: expensive objects are only sent once over the wire.

Philipp