We have an application that compares data objects to determine whether one version of an object differs from another. The application also does extensive caching of these objects, and we've run into a performance issue with these comparisons.

Here's the workflow:

  1. Data item 1 is the current item in memory. It was initially retrieved from the cache and deep cloned (including all sub-objects such as Dictionaries, etc.). Data item 1 is then edited and its properties are modified.
  2. This object is then compared against the original version stored in the cache. Since data item 1 was cloned and its properties were changed, the two objects should be different.

There are a couple of issues here.

The main issue is that our deep clone method is very expensive. We profiled it against a shallow clone and it was 10x slower. That's crap. Here's our deep clone method:

    public object Clone()
    {
        // Deep clone by round-tripping the entire object graph through the
        // BinaryFormatter; this requires the class and everything it
        // references to be marked [Serializable].
        using (var memStream = new MemoryStream())
        {
            var binaryFormatter = new BinaryFormatter(null, new StreamingContext(StreamingContextStates.Clone));
            binaryFormatter.Serialize(memStream, this);
            memStream.Seek(0, SeekOrigin.Begin);
            return binaryFormatter.Deserialize(memStream);
        }
    }

We were initially using the following to clone:

    public object Clone()
    {
        // Shallow copy: value-type fields are copied, but reference-type
        // members (Dictionaries, etc.) still point at the cached originals.
        return this.MemberwiseClone();
    }

This was more performant, but because it does a shallow clone, the complex objects that were properties of this object (Dictionaries, etc.) were not cloned. The clone would still hold the same references as the object in the cache, so those properties would compare as identical.
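
To make the problem concrete, here is a minimal sketch (the CachedItem type and its properties are made up for illustration) showing why the shallow copy defeats the comparison: the Dictionary on the clone and on the cached original are the same instance.

    using System;
    using System.Collections.Generic;

    class CachedItem
    {
        public string Name { get; set; }
        public Dictionary<string, string> Attributes { get; set; }

        public object ShallowClone()
        {
            return MemberwiseClone();
        }
    }

    class Program
    {
        static void Main()
        {
            var original = new CachedItem
            {
                Name = "original",
                Attributes = new Dictionary<string, string> { { "colour", "red" } }
            };

            var copy = (CachedItem)original.ShallowClone();
            copy.Attributes["colour"] = "blue"; // mutates the one shared dictionary

            // True: both objects hold the same Dictionary reference, so any edit
            // to the copy also shows up in the cached original and the comparison
            // never sees a difference in that property.
            Console.WriteLine(ReferenceEquals(original.Attributes, copy.Attributes));
            Console.WriteLine(original.Attributes["colour"]); // prints "blue"
        }
    }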

So, does anyone have an efficient way of doing a deep clone on C# objects that would cover cloning the entire object graph?

+1  A: 

Maybe you should not deep clone then?

Other options:

1) Make your "cached" object remember its original state and have it update a "changed" flag every time anything changes (see the sketch after these options).

2) Do not remember the original state; just flag the object as dirty once anything has ever changed, then reload the object from the original source when you need to compare. I bet your objects go unchanged more often than they change, and change back to the same value even less often.
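
A rough sketch of what option 1 could look like, using a hypothetical CachedItem type: every mutation has to go through a member that raises the flag, which is exactly the plumbing the comment below objects to.

    using System.Collections.Generic;

    class CachedItem
    {
        private string name;
        private readonly Dictionary<string, string> attributes = new Dictionary<string, string>();

        // "Has this object changed since it was cached?" becomes a flag read
        // instead of a deep comparison against the cached copy.
        public bool IsDirty { get; private set; }

        public string Name
        {
            get { return name; }
            set { if (name != value) { name = value; IsDirty = true; } }
        }

        public void SetAttribute(string key, string value)
        {
            attributes[key] = value;
            IsDirty = true;
        }

        // Call this when the object is (re)stored in the cache.
        public void MarkClean()
        {
            IsDirty = false;
        }
    }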

zvolkov
DirtyFlag? That is a ton of work: you'd lose automatic properties, the code gets littered with SetIsDirty() calls, and it's easy to forget to set the flag (easy to create bugs). He'd gain a lot more benefit from implementing IComparable, with sorting, etc...
Chad Grant
+3  A: 

You're not going to get much better than your generic binary serialization without explicitly implementing ICloneable on all the data objects that need to be cloned. Another possible route is reflection, but you won't be happy with that either if performance is what you're after.

I would consider taking the hit of implementing ICloneable for the deep copy and/or IComparable for comparing whether the objects differ ... if performance is that big of an issue for you.
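
For what it's worth, a hand-written deep copy along those lines might look something like this (the CachedItem type and its members are hypothetical; the point is that each mutable sub-object is rebuilt explicitly instead of being round-tripped through a serializer):

    using System;
    using System.Collections.Generic;

    class CachedItem : ICloneable
    {
        public string Name { get; set; }
        public Dictionary<string, string> Attributes { get; set; }
        public List<int> Scores { get; set; }

        // Hand-rolled deep copy: immutable members (strings, value types) are
        // copied directly, and every mutable sub-object is rebuilt explicitly.
        public object Clone()
        {
            return new CachedItem
            {
                Name = Name,
                Attributes = Attributes == null ? null : new Dictionary<string, string>(Attributes),
                Scores = Scores == null ? null : new List<int>(Scores)
            };
        }
    }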

Chad Grant
+1  A: 

It is possible that my response does not apply to your case, because I don't know what your restrictions and requirements are, but my feeling is that general-purpose cloning can be problematic. As you have already discovered, performance may be an issue: something needs to identify the unique instances in the object graph and then create an exact copy of each. This is what the binary serializer does for you, but it also does more (the serialization itself), so I am not surprised that it is slower than you expected. I have had a similar experience (incidentally also related to caching).

My approach would be to implement the cloning myself, i.e. implement ICloneable on the classes that actually need to be cloned. How many of the classes in your application are you caching? If there are too many to hand-code the cloning, would it make sense to consider some code generation?

Jan Zich