Hi

I need to copy an object that has a pretty deep hierarchy of member variables (i.e., the object has several members of various types, each of which has several members of various types, etc.). Doing a deep copy would require implementing a clone() method in many classes, which seems overkill for my application.

The solution I've come up with is fairly simple, but I wonder if it's a bad idea. My solution is as follows:

  • Define an empty interface, Settings.
  • Define an interface called IsCopyable, which has methods getSettings() and applySettings(Settings s).
  • For a given class X that implements IsCopyable, a class that implements the interface Settings is written that holds the 'settings' that must be applied to an object of class X in order to copy it. (I usually nest this class within the class X, so I have X.Settings implements Settings, but this could be done elsewhere.)

Then, to copy an instance of class X:

X myX = new X();
// Stuff happens to myX.
// Now we want to copy myX.
X copyOfX = new X();
copyOfX.applySettings(myX.getSettings());

I.e., to copy a given object, create a new instance of that object, then call getSettings() on the object to be copied, passing the resulting Settings object as a value to applySettings() on the new instance. (Of course, the copying could be wrapped in a member called copy() or something.)
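As a rough illustration of the scheme described above, here is a minimal Java sketch. Only Settings, IsCopyable, getSettings() and applySettings() come from the question; the Engine and Car classes, their fields, and the nested settings classes are hypothetical names made up for the example.

```java
// Marker interface from the question: deliberately empty.
interface Settings {}

interface IsCopyable {
    Settings getSettings();
    void applySettings(Settings s);
}

// Hypothetical leaf class; its single field is made up for illustration.
class Engine implements IsCopyable {
    private int horsepower;

    static final class EngineSettings implements Settings {
        final int horsepower;
        EngineSettings(int horsepower) { this.horsepower = horsepower; }
    }

    int getHorsepower() { return horsepower; }
    void setHorsepower(int horsepower) { this.horsepower = horsepower; }

    @Override public Settings getSettings() { return new EngineSettings(horsepower); }

    @Override public void applySettings(Settings s) {
        // Settings is empty, so a downcast is unavoidable here.
        this.horsepower = ((EngineSettings) s).horsepower;
    }
}

// Hypothetical owner class: its settings nest the settings of its member.
class Car implements IsCopyable {
    private String name = "";
    private final Engine engine = new Engine();

    static final class CarSettings implements Settings {
        final String name;
        final Settings engine;
        CarSettings(String name, Settings engine) { this.name = name; this.engine = engine; }
    }

    String getName() { return name; }
    void setName(String name) { this.name = name; }
    Engine getEngine() { return engine; }

    @Override public Settings getSettings() { return new CarSettings(name, engine.getSettings()); }

    @Override public void applySettings(Settings s) {
        CarSettings cs = (CarSettings) s;
        this.name = cs.name;
        engine.applySettings(cs.engine); // recurse into the member
    }
}
```

Note that the copy gets its own Engine instance, so later changes to the original's engine do not leak into the copy.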

This works very well for my particular problem, but am I doing something stupid? Have I (poorly) reinvented something that already exists?

Thanks in advance.

Chris

+3  A: 

I want to answer 'yes' to your last question, just for fun :-) But I know you would like some arguments, so let me try:

  • If a concept (here, copying) is unique to an object, you can consider merging the two in the same class, i.e. implementing the method in the class itself. For example:

    • If, for some objects, there are several ways to make copies, depending on ..., then the copying really deserves to live outside the original object.
    • If every object has a single way of being copied, is there really much value in creating a different object to do the copying? (There can be, for example when the combined complexity becomes too hard to read and understand, so it is worth breaking it down.)
  • I have always had trouble creating copies by calling an explicit constructor. The reason is that constructors are the only methods that cannot be inherited (statics aside), so they cannot be generic: it is impossible to have a single interface for all your copyable objects. This means that no generic code, anywhere in your application, is able to copy your objects. Every time I tried this, there came a point where I really needed to make a copy in a general way.

  • Explicitly calling constructors also means it will be impossible, in the future, to substitute a subclass. Say you have an algorithm A that works on a variable of type B. If you give A an instance of C, a subclass of B, then when A makes a copy of its B variable (whose actual type is C), the copy will be created with the B constructor, so it will not be of the same class and will probably behave differently. So copying by calling a constructor is extremely limited.

  • Explicitly calling constructors means it is impossible to work with interfaces. You can read in so many places about the value of interfaces... For example, in our application, many objects are not instantiated directly in our code; instead, a Locator/Factory is asked for an interface (or class), with many possible advantages (if your application comes to need this one day):

    • If I want to substitute a B subclass for every A object in a specific context, for example to measure the performance of a costly operation during some automated testing, it's very easy. We also needed to substitute a subclass for HashMap, to find one non-serializable object that was inserted in the Map and later caused errors during serialization.
    • If I have an interface, creating an object only involves the interface in my regular code (Factory excepted). So I have no dependency at all on the concrete class, which is very good, as you know (a dependency on an interface has far fewer transitive dependencies, and is much easier to mock for testing).
    • This factory is actually Spring-backed in our case, so the instantiation is done via Spring. Many additional steps are taken as needed (proxying, interception, initialization methods ...).
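The subclass-substitution problem from the list above can be made concrete with a small Java sketch. All class and method names here (Shape, Circle, copy(), CopyingAlgorithm) are hypothetical and chosen only for illustration; the point is the contrast between a copy made with a hard-coded constructor and one made through an overridable method.

```java
// Hypothetical base class with a copy constructor and a polymorphic copy().
class Shape {
    double scale = 1.0;

    Shape() {}
    Shape(Shape other) { this.scale = other.scale; }

    // Overridable, so the runtime type decides which class gets instantiated.
    Shape copy() { return new Shape(this); }
}

// Hypothetical subclass that overrides copy() with a covariant return type.
class Circle extends Shape {
    double radius = 1.0;

    Circle() {}
    Circle(Circle other) { super(other); this.radius = other.radius; }

    @Override Circle copy() { return new Circle(this); }
}

// Stands in for "algorithm A" working on a variable of the base type.
class CopyingAlgorithm {
    // Hard-coded constructor call: the result is always a plain Shape.
    static Shape copyByConstructor(Shape s) { return new Shape(s); }

    // Polymorphic call: a Circle stays a Circle.
    static Shape copyPolymorphically(Shape s) { return s.copy(); }
}
```

Given a Circle, copyByConstructor silently returns a plain Shape and drops the radius, while copyPolymorphically returns a Circle.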


In our application, we usually end up creating one (or a few) cloners. Given a top object, they know how to make a deep copy of it. The advantage with a generic cloner is that the code is written only once, it is generic for the whole application. Often, it is also reused between applications...

Implementation: using reflection, for example, you visit every member recursively. There are many traps to avoid, however:

  • loops: A references B that references C that references A. So I keep a Map of the objects that have already been copied, each referencing its copy. When I am about to copy an object and find that it is already in the map, I don't copy it again, but substitute its already-made copy.
  • special types: enums should not be copied (also some other static objects). Some library classes could have problems also, so you can keep a Set or Map of special classes that you don't want to copy, or copy in a special way.
  • you can get in trouble with final fields ...
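A minimal Java sketch of such a reflective cloner follows. It is not the code from our application; ReflectiveCloner and the sample Node class are hypothetical names. It assumes every copied class has a no-argument constructor, and it simply shares (does not copy) enums and anything from the java.* packages, which is only safe for immutable types such as String; mutable library classes like collections would need special handling.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.IdentityHashMap;
import java.util.Map;

final class ReflectiveCloner {
    // original -> copy, keyed by identity so that cycles resolve to the existing copy
    private final Map<Object, Object> copies = new IdentityHashMap<>();

    @SuppressWarnings("unchecked")
    <T> T deepCopy(T original) {
        try {
            return (T) copy(original);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot copy " + original.getClass(), e);
        }
    }

    private Object copy(Object original) throws ReflectiveOperationException {
        if (original == null) return null;
        Class<?> type = original.getClass();
        // Assumption: enums and java.* objects (String, boxed values, ...) are shared.
        if (original instanceof Enum || type.getName().startsWith("java.")) return original;

        Object known = copies.get(original);
        if (known != null) return known; // already copied: reuse, do not recurse

        if (type.isArray()) {
            int length = Array.getLength(original);
            Object arrayCopy = Array.newInstance(type.getComponentType(), length);
            copies.put(original, arrayCopy);
            for (int i = 0; i < length; i++) {
                Array.set(arrayCopy, i, copy(Array.get(original, i)));
            }
            return arrayCopy;
        }

        // Assumption: a (possibly private) no-argument constructor exists.
        Constructor<?> ctor = type.getDeclaredConstructor();
        ctor.setAccessible(true);
        Object result = ctor.newInstance();
        copies.put(original, result); // register before recursing into fields

        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) continue; // statics are not per-object state
                f.setAccessible(true);
                f.set(result, copy(f.get(original)));
            }
        }
        return result;
    }
}

// Hypothetical sample type used to exercise the cloner.
class Node {
    String name;
    Node next;
    int[] data;
}
```

Use a fresh ReflectiveCloner per top-level copy, since the map of already-copied objects is kept in the instance.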

Specific cases

There are often specific objects for which the default way is not correct. We want both a generic implementation and the possibility to override it as needed. For those objects, we use this:

  • if we can modify the objects, we let them implement a specific CustomizedCopier interface, and their code in that method is responsible for doing the copying however they want. The generic code does nothing itself when it sees this interface.
  • if we can't modify the objects (JRE, third-party code ...), we have a registry Map that stores the classes that need special treatment, along with the specific copier that we want for each of them. Note that this trick is also sometimes used to customize the copying not in general, but only for some special use-cases, as it can override the way objects are copied.
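The two escape hatches above could be wired together roughly as follows. This is a hedged Java sketch, not our production code: the method name customCopy(), the CopierRegistry class and its register()/copy() methods are all hypothetical, and the generic fallback is passed in as a function so the sketch does not depend on any particular cloner.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Opt-in interface for classes we can modify (method name is an assumption).
interface CustomizedCopier {
    Object customCopy();
}

final class CopierRegistry {
    // Special copiers for classes we cannot modify (JRE, third-party code ...).
    private final Map<Class<?>, UnaryOperator<Object>> special = new HashMap<>();
    // Whatever generic copier the application uses by default.
    private final UnaryOperator<Object> generic;

    CopierRegistry(UnaryOperator<Object> generic) {
        this.generic = generic;
    }

    <T> void register(Class<T> type, UnaryOperator<T> copier) {
        special.put(type, o -> copier.apply(type.cast(o)));
    }

    Object copy(Object original) {
        if (original == null) return null;
        // 1. The object knows how to copy itself.
        if (original instanceof CustomizedCopier) {
            return ((CustomizedCopier) original).customCopy();
        }
        // 2. A special copier is registered for exactly this class.
        UnaryOperator<Object> copier = special.get(original.getClass());
        // 3. Otherwise fall back to the generic copier.
        return (copier != null ? copier : generic).apply(original);
    }
}
```

For example, registering `java.util.Date.class` with `d -> new java.util.Date(d.getTime())` makes every Date go through that copier, while unregistered classes fall through to the generic one. The lookup here matches the exact runtime class only; matching subclasses as well would be a design decision left open.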


In fact, I usually end up with several cloners. For example, a cloner for persistent data entities typically uses its knowledge of persistence to clone a bit differently (for example, ids and audit fields can be set to null).

I usually also have a class that performs the same traversal of the object graph, but for other needs:

  • toString() a complex object to create a debug String of it.
  • equals() and hashCode() implementations if needed.
  • reinitialise a graph of objects to its default values for all properties (think of the implementation of a 'reset' button in a multi-tab huge form).
  • check for existence of an object somewhere in an object graph
  • check the serializability of a graph of objects (a typical use-case is the HttpSession, which is serialized under some conditions; in development, we check the objects explicitly, to detect a non-serializable object and provide the best error message to the developer).
  • ...

Please note that copying is often needed for multi-threading. Ideally, objects reused in multi-threaded environments are immutable. If they are not, cloning is typically advised to keep the program's overall state consistent...


Performance

Using reflection is not always so fast. Typically, for copying that is big in volume and used often, we would implement the copying in the objects themselves. But we found that only a few classes both need to be copied and occur in high volume, so this is just an exception to the general mechanism, which we plug in afterwards (using the registry described earlier in this post) only when it becomes useful.

KLE
A: 

You can use the ICloneable interface; here's an explanation: http://blog.reamped.net/post/2008/03/Implementing-Interfaces-ICloneable-and-IComparable.aspx

rdkleine
A: 

I'm currently in a similar situation where I have to copy a custom server control for ASP.NET. The problem there is that cloning with MemberwiseClone() has some bad side effects due to the complex control hierarchy.

I don't know how to solve that problem yet, but my idea is to implement the ICloneable interface without using MemberwiseClone(), creating a new instance instead, like this:

using System;

public class MyObject : ICloneable
{
   ...

   // Explicit interface implementation: no access modifier is allowed here.
   object ICloneable.Clone()
   {
      MyObject clone = new MyObject();
      //copy everything I want to have from this object to the clone

      return clone;
   }
}

Then you just call

MyObject someObjInstance = new MyObject();
...
...
// Clone() returns object, so the result has to be cast back to MyObject.
MyObject clone = (MyObject)((ICloneable)someObjInstance).Clone();
...

And you would get a brand new object with the copied members you need to have. I don't see the benefit of having the settings you mentioned, since the class implementing Clone() will know what to copy.

It depends on your specific purpose, of course. This means creating new instances yourself rather than relying on the shallow, field-by-field copy that MemberwiseClone() would make.

Juri
A: 

If your interface Settings is empty (no properties or methods), I would call that a code smell. Also, by convention, interface names should start with the letter 'I', e.g. ISettings.

How can you implement the method applySettings(Settings s)? Since, as you stated, the Settings interface is empty, you can't do anything with the s parameter unless you cast it to another type. This is bad, or better described as 'pointless', because the parameter is strongly typed but that strong typing is ignored.

Also, I don't see a good reason to have an applySettings() method. If you have one object, and a second object that has the values you need, just use the second object. If the second object is in use (in some other object graph), then make a copy of it and use that. There should be no reason to write the logic that says 'make this object a copy of another object' since you must already write 'make a copy'.

Making a deep copy should be easy from a client perspective. The ICloneable interface is perfect for this. As a design pattern, this is known as the Prototype pattern. If you need different ways to create a copy (i.e. different kinds of deep copy operations), you may consider using a Builder pattern to implement the Clone() method.

Matt Brunell
+1  A: 

It may be a code smell, and you may need to take another approach. Your object should be "obviously cloneable"; there are several ways to design this type of object:

  • Your object is a value object (a struct in C#) and references only other value objects or immutable objects. (In this case, cloning is as easy as assigning a variable.)
  • Your object is immutable, so you don't need to clone it and can just share the reference.
  • Your object is a "bag of properties", which is easy to clone because each member is also obviously cloneable. (Bindings in WCF are like this.)
  • Your object is just serializable data, so you can serialize it and deserialize it to create a new instance.
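The serialize/deserialize option from the last bullet can be sketched in Java with the standard java.io object streams. SerializationCopier is a hypothetical helper name; the sketch assumes that every object reachable from the root implements Serializable, and it fails with an exception otherwise.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SerializationCopier {
    private SerializationCopier() {}

    // Deep-copies an object graph by writing it to memory and reading it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            // Typically a NotSerializableException somewhere in the graph.
            throw new IllegalStateException("Object graph could not be serialized", e);
        }
    }
}
```

This is simple and handles cycles and shared references within the graph, at the cost of speed and of requiring the whole graph to be serializable.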

I don't know enough about your design to tell you if it's the best solution, but generally what I do is to create a factory, which is just a "bag of properties" to create my object, then I implement ICloneable on the factory.

Nicolas Dorier
The serialize/deserialize idea is quite elegant; I hadn't thought of that.
Microserf