views:

1505

answers:

8

I wish to implement a deepcopy of my classes hierarchy in C#

public Class ParentObj : ICloneable
{
    protected int   myA;
    public virtual Object Clone ()
        {
             ParentObj newObj = new ParentObj();
             newObj.myA = theObj.MyA;
             return newObj;
        }
}

public Class ChildObj : ParentObj
{
    protected int   myB;
    public override Object Clone ( )
        {
             Parent newObj = this.base.Clone();
             newObj.myB = theObj.MyB;

             return newObj;
        }
}

This will not work as when Cloning the Child only a parent is new-ed. In my code some classes have large hierarchies.

What is the recommended way of doing this? Cloning everything at each level without calling the base class seems wrong? There must be some neat solutions to this problem, what are they?

Can I thank everyone for their answers. It was really interesting to see some of the approaches. I think it would be good if someone gave an example of a reflection answer for completeness. +1 awaiting!

A: 
  1. You could use reflection to loop all variables and copy them.(Slow) if its to slow for you software you could use DynamicMethod and generate il.
  2. serialize the object and deserialize it again.
Petoj
I thought of 1. but ruled it out for the reasons you state...
Tony Lambert
A: 

I don't think you are implementing ICloneable correctly here; It requires a Clone() method with no parameters. What I would recommend is something like:

public class ParentObj : ICloneable
{
    public virtual Object Clone()
    {
        var obj = new ParentObj();

        CopyObject(this, obj);
    }

    protected virtual CopyObject(ParentObj source, ParentObj dest)
    {
        dest.myA = source.myA;
    }
}

public class ChildObj : ParentObj
{
    public override Object Clone()
    {
        var obj = new ChildObj();
        CopyObject(this, obj);
    }

    public override CopyObject(ChildObj source, ParentObj dest)
    {
        base.CopyObject(source, dest)
        dest.myB = source.myB;
    }
}

Note that CopyObject() is basically Object.MemberwiseClone(), presumeably you would be doing more than just copying values, you would also be cloning any members that are classes.

Chris Shaffer
+6  A: 

try the serialization trick:

public object Clone(object toClone)
{
    BinaryFormatter bf = new BinaryFormatter();
    MemoryStream ms= new MemoryStream();
    bf.Serialize(ms, toClone);
    ms.Flush();
    ms.Position = 0;
    return bf.Deserialize(ms);
}
Fernando
+1, the only drawback is that your object needs to be marked with the SerializableAttribute.
Darin Dimitrov
+1 I really like this... but a may not be as fast as I need.
Tony Lambert
A bigger drawback is that it requires _all_ objects in the graph to be serializable, not just your class. This may not always be feasible, and some existing types that are cloneable aren't serializable (e.g. `XPathNavigator` implements `ICloneable`, but it's not serializable).
Pavel Minaev
+1  A: 

The best way is by serializing your object, then returning the deserialized copy. It will pick up everything about your object, except those marked as non-serializable, and makes inheriting serialization easy.

[Serializable]
public class ParentObj: ICloneable
{
    private int myA;
    [NonSerialized]
    private object somethingInternal;

    public virtual object Clone()
    {
        MemoryStream ms = new MemoryStream();
        BinaryFormatter formatter = new BinaryFormatter();
        formatter.Serialize(ms, this);
        object clone = formatter.Deserialize(ms);
        return clone;
    }
}

[Serializable]
public class ChildObj: ParentObj
{
    private int myB;

    // No need to override clone, as it will still serialize the current object, including the new myB field
}

It is not the most performant thing, but neither is the alternative: relection. The benefit of this option is that it seamlessly inherits.

jrista
+5  A: 

The typical approach is to use "copy constructor" pattern a la C++:

 class Base : ICloneable
 { 
     int x;

     protected Base(Base other)
     {
         x = other.x;
     }

     public virtual object Clone()
     {
         return new Base(this);
     }
 }

 class Derived : Base
 { 
     int y;

     protected Derived(Derived other)
          : Base(other)
     {
         y = other.y;
     }

     public override object Clone()
     {
         return new Derived(this);
     }
 }

The other approach is to use Object.MemberwiseClone in the implementation of Clone - this will ensure that result is always of the correct type, and will allow overrides to extend:

 class Base : ICloneable
 { 
     List<int> xs;

     public virtual object Clone()
     {
         Base result = this.MemberwiseClone();

         // xs points to same List object here, but we want
         // a new List object with copy of data
         result.xs = new List<int>(xs);

         return result;
     }
 }

 class Derived : Base
 { 
     List<int> ys;

     public override object Clone()
     {
         // Cast is legal, because MemberwiseClone() will use the
         // actual type of the object to instantiate the copy.
         Derived result = (Derived)base.Clone();

         // ys points to same List object here, but we want
         // a new List object with copy of data
         result.ys = new List<int>(ys);

         return result;
     }
 }

Both approaches require that all classes in the hierarchy follow the pattern. Which one to use is a matter of preference.

If you just have any random class implementing ICloneable with no guarantees on implementation (aside from following the documented semantics of ICloneable), there's no way to extend it.

Pavel Minaev
Important note: Since the MemberwiseClone method results in a shallow copy, you will have to create new instances of all reference type members. You'll also want to null out any delegates/events.
Matt Brunell
Not _all_ reference type members - only those for which you actually need it. For example, `String` is a reference type, but I don't think you would want to create new instances of that even in a deep clone. Ditto for any other "value class" such as `Uri` or `XName`.
Pavel Minaev
Copy-constructor is not part any C# best practices. One should use `ICloneable` and `MemberwiseClone` instead.
taoufik
@taoufik, please provide a reference for that claim (that implementation of Clone via MemberwiseClone is preferred over copy constructor - I'm not aware on any commonly recognized "best practices" on this particular point). I would also appreciate if people who downvoted would explain the downvotes.
Pavel Minaev
+1 Copy constructor provides a pretty clean implementation for ICloneable.
Chris Shaffer
+1 Agreed copy constructor is nice and clean.
csharptest.net
A: 

You should use the MemberwiseClone method instead:

public class ParentObj : ICloneable
{
    protected int myA;
    public virtual Object Clone()
    {
        ParentObj newObj = this.MemberwiseClone() as ParentObj;
        newObj.myA = this.MyA; // not required, as value type (int) is automatically already duplicated.
        return newObj;
    }
}

public class ChildObj : ParentObj
{
    protected int myB;
    public override Object Clone()
        {
             ChildObj newObj = base.Clone() as ChildObj;
             newObj.myB = this.MyB; // not required, as value type (int) is automatically already duplicated

             return newObj;
        }
}
taoufik
is the "newObj.myA = theObj.myA;" doesn't memberwiseclone in this case copy it as it isn't an object?
Tony Lambert
+1 by the way....
Tony Lambert
It will work fine. Basically, any ChildObj object is both a ParentObj and a ChildObj. If you'd have a reference to a ChildObj, and would call MemberwiseClone (independent of where you call it from) on it, you'd get an object which is both ParentObj and a ChildObj as well. Therefore, "newObj.myA = this.myA;" will be completely valid.
taoufik
1) What's `this.base`? 2) Why use `as` when you know the cast will always succeed? 3) Why copy field values when `MemberwiseClone` already does just that?
Pavel Minaev
4) What is `theObj`? Do you mean `this`?
Pavel Minaev
@Pavel: the idea is the answer the question, and not to rewrite his code or to answer any unasked question. this.base => base; whether you use `as` or `(ChildObj)` is a preference a developer has. I think `(ChildObj)` looks horrible. and yes, you don't need to copy any value-type fields. But once again, this is not part of his question.
taoufik
-1 Title specifically says he wants a deep copy, MemberwiseClone is specifically a shallow copy.
Chris Shaffer
@Chris, the idea is that you use `MemberwiseCopy` to do field-by-field copy and also to ensure that cloned object has the same correct type, and then manually do deep copy for all fields that necessitate it inside `Clone`. This answer (nor my answer, which also suggests the same technique) does not in any way imply that `MemberwiseClone` alone is the answer to his problem, but only that it is part of the solution.
Pavel Minaev
+2  A: 

WARNING:

This code should be used with a great deal of caution. Use at your own risk. This example is provided as-is and without a warranty of any kind.


There is one other way to perform a deep clone on an object graph. It is important to be aware of the following when considering using this sample:

Cons:

  1. Any references to external classes will also be cloned unless those references are provided to the Clone(object, ...) method.
  2. No constructors will be executed on cloned objects they are reproduced EXACTLY as they are.
  3. No ISerializable or serialization constructors will be executed.
  4. There is no way to alter the behavior of this method on a specific type.
  5. It WILL clone everything, Stream, AppDomain, Form, whatever, and those will likely break your application in horrific ways.
  6. It could break whereas using the serialization method is much more likely to continue working.
  7. The implementation below uses recursion and can easily cause a stack overflow if your object graph is too deep.

So why would you want to use it?

Pros:

  1. It does a complete deep-copy of all instance data with no coding required in the object.
  2. It preserves all object graph references (even circular) in the reconstituted object.
  3. It's executes more than 20 times fatser than the binary formatter with less memory consumption.
  4. It requires nothing, no attributes, implemented interfaces, public properties, nothing.

Code Usage:

You just call it with an object:

Class1 copy = Clone(myClass1);

Or let's say you have a child object and you are subscribed to it's events... Now you want to clone that child object. By providing a list of objects to not clone, you can preserve some potion of the object graph:

Class1 copy = Clone(myClass1, this);

Implementation:

Now let's get the easy stuff out of the way first... Here is the entry point:

public static T Clone<T>(T input, params object[] stableReferences)
{
 Dictionary<object, object> graph = new Dictionary<object, object>(new ReferenceComparer());
 foreach (object o in stableReferences)
  graph.Add(o, o);
 return InternalClone(input, graph);
}

Now that is simple enough, it just builds a dictionary map for the objects during the clone and populates it with any object that should not be cloned. You will note the comparer provided to the dictionary is a ReferenceComparer, let's take a look at what it does:

class ReferenceComparer : IEqualityComparer<object>
{
 bool IEqualityComparer<object>.Equals(object x, object y)
 { return Object.ReferenceEquals(x, y); }
 int IEqualityComparer<object>.GetHashCode(object obj)
 { return RuntimeHelpers.GetHashCode(obj); }
}

That was easy enough, just a comparer that forces the use of the System.Object's get hash and reference equality... now comes the hard work:

private static T InternalClone<T>(T input, Dictionary<object, object> graph)
{
 if (input == null || input is string || input.GetType().IsPrimitive)
  return input;

 Type inputType = input.GetType();

 object exists;
 if (graph.TryGetValue(input, out exists))
  return (T)exists;

 if (input is Array)
 {
  Array arItems = (Array)((Array)(object)input).Clone();
  graph.Add(input, arItems);

  for (long ix = 0; ix < arItems.LongLength; ix++)
   arItems.SetValue(InternalClone(arItems.GetValue(ix), graph), ix);
  return (T)(object)arItems;
 }
 else if (input is Delegate)
 {
  Delegate original = (Delegate)(object)input;
  Delegate result = null;
  foreach (Delegate fn in original.GetInvocationList())
  {
   Delegate fnNew;
   if (graph.TryGetValue(fn, out exists))
    fnNew = (Delegate)exists;
   else
   {
    fnNew = Delegate.CreateDelegate(input.GetType(), InternalClone(original.Target, graph), original.Method, true);
    graph.Add(fn, fnNew);
   }
   result = Delegate.Combine(result, fnNew);
  }
  graph.Add(input, result);
  return (T)(object)result;
 }
 else
 {
  Object output = FormatterServices.GetUninitializedObject(inputType);
  if (!inputType.IsValueType)
   graph.Add(input, output);
  MemberInfo[] fields = inputType.GetFields(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance);
  object[] values = FormatterServices.GetObjectData(input, fields);

  for (int i = 0; i < values.Length; i++)
   values[i] = InternalClone(values[i], graph);

  FormatterServices.PopulateObjectMembers(output, fields, values);
  return (T)output;
 }
}

You will notice right-off the special case for array and delegate copy. Each have their own reasons, first Array does not have 'members' that can be cloned, so you have to handle this and depend on the shallow Clone() member and then clone each element. As for the delegate it may work without the special-case; however, this will be far safer since it's not duplicating things like RuntimeMethodHandle and the like. If you intend to include other things in your hierarchy from the core runtime (like System.Type) I suggest you handle them explicitly in similar fashion.

The last case, and most common, is simply to use roughly the same routines that are used by the BinaryFormatter. These allow us to pop all the instance fields (public or private) out of the original object, clone them, and stick them into an empty object. The nice thing here is that the GetUninitializedObject returns a new instance that has not had the ctor run on it which could cause issues and slow the performance.

Whether the above works or not will highly depend upon your specific object graph and the data therein. If you control the objects in the graph and know that they are not referencing silly things like a Thread then the above code should work very well.

Testing:

Here is what I wrote to originally test this:

class Test
{
 public Test(string name, params Test[] children)
 {
  Print = (Action<StringBuilder>)Delegate.Combine(
   new Action<StringBuilder>(delegate(StringBuilder sb) { sb.AppendLine(this.Name); }),
   new Action<StringBuilder>(delegate(StringBuilder sb) { sb.AppendLine(this.Name); })
  );
  Name = name;
  Children = children;
 }
 public string Name;
 public Test[] Children;
 public Action<StringBuilder> Print;
}

static void Main(string[] args)
{
 Dictionary<string, Test> data2, data = new Dictionary<string, Test>(StringComparer.OrdinalIgnoreCase);

 Test a, b, c;
 data.Add("a", a = new Test("a", new Test("a.a")));
 a.Children[0].Children = new Test[] { a };
 data.Add("b", b = new Test("b", a));
 data.Add("c", c = new Test("c"));

 data2 = Clone(data);
 Assert.IsFalse(Object.ReferenceEquals(data, data2));
 //basic contents test & comparer
 Assert.IsTrue(data2.ContainsKey("a"));
 Assert.IsTrue(data2.ContainsKey("A"));
 Assert.IsTrue(data2.ContainsKey("B"));
 //nodes are different between data and data2
 Assert.IsFalse(Object.ReferenceEquals(data["a"], data2["a"]));
 Assert.IsFalse(Object.ReferenceEquals(data["a"].Children[0], data2["a"].Children[0]));
 Assert.IsFalse(Object.ReferenceEquals(data["B"], data2["B"]));
 Assert.IsFalse(Object.ReferenceEquals(data["B"].Children[0], data2["B"].Children[0]));
 Assert.IsFalse(Object.ReferenceEquals(data["B"].Children[0], data2["A"]));
 //graph intra-references still in tact?
 Assert.IsTrue(Object.ReferenceEquals(data["B"].Children[0], data["A"]));
 Assert.IsTrue(Object.ReferenceEquals(data2["B"].Children[0], data2["A"]));
 Assert.IsTrue(Object.ReferenceEquals(data["A"].Children[0].Children[0], data["A"]));
 Assert.IsTrue(Object.ReferenceEquals(data2["A"].Children[0].Children[0], data2["A"]));
 data2["A"].Name = "anew";
 StringBuilder sb = new StringBuilder();
 data2["A"].Print(sb);
 Assert.AreEqual("anew\r\nanew\r\n", sb.ToString());
}

Final Note:

Honestly it was a fun exercise at the time. It is generally a great thing to have deep cloning on a data model. Today's reality is that most data models are generated which obsoletes the usefulness of the hackery above with a generated deep clone routine. I highly recommend generating your data model & it's ability to perform deep-clones rather than using the code above.

csharptest.net
Thanks for the extensive answer. +1 for effort alone! Will need to look at this in detail (to do it justice) when I come to implementing my solution. Thx.
Tony Lambert
A: 

Try to use the following [use the keyword "new"]

public class Parent
{
  private int _X;
  public int X{ set{_X=value;} get{return _X;}}
  public Parent copy()
  {
     return new Parent{X=this.X};
  }
}
public class Child:Parent
{
  private int _Y;
  public int Y{ set{_Y=value;} get{return _Y;}}
  public new Child copy()
  {
     return new Child{X=this.X,Y=this.Y};
  }
}
Waleed A.K.