If a Java class implements the Serializable interface but does not have a public clone() method, it is usually possible to create a deep copy like this:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class CloneHelper {
    // Deep-copies obj by serializing it to a byte array and reading it back.
    @SuppressWarnings("unchecked")
    public static <T extends Serializable> T clone(T obj) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            // try-with-resources closes the streams even if writing or reading fails
            try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
                oos.writeObject(obj);
            }
            byte[] bytes = baos.toByteArray();
            try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                return (T) ois.readObject();
            }
        } catch (ClassNotFoundException ex) {
            // Shouldn't happen
            throw new Error(ex);
        } catch (IOException ex) {
            // Probably a bug in T's custom serialization methods
            throw new RuntimeException(ex);
        }
    }
}

I often encounter third-party library classes like this and resort to hacks like the one above. I've even extended ObjectOutputStream on occasion to make the copy shallower. It's never caused serious problems, other than being inefficient (it is slow to encode and decode, and the temporary serialization graphs can consume a lot of memory).

And if it is not safe to use this technique, the class probably should not have been declared Serializable.

So what I would like to know is: if your class is Serializable, what might prevent you from defining a public clone() method (using either the Cloneable interface or a copy constructor)?


Related: Copy an object in Java

+3  A: 

I would prefer to use a copy constructor rather than use the above mechanism. You can more finely define what's to be deeply or shallowly copied, and make the copying of an object distinct from the serialisation of an object. A copy constructor may (for example) allow two objects to share a reference to a master object, whereas this may not be appropriate for an object serialised and transmitted across the network.
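As a minimal sketch of that idea (the `Catalog` and `Order` classes are hypothetical, invented only for illustration), a copy constructor can share the reference to a master object while duplicating the mutable state:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical shared "master" object that every copy should keep pointing at.
final class Catalog {
    final String name;
    Catalog(String name) { this.name = name; }
}

// Hypothetical class whose copy constructor decides, field by field,
// what is shared and what is duplicated.
final class Order {
    private final Catalog catalog;     // shared between copies
    private final List<String> items;  // duplicated for each copy

    Order(Catalog catalog, List<String> items) {
        this.catalog = catalog;
        this.items = new ArrayList<>(items);
    }

    // Copy constructor: same Catalog reference, independent item list.
    Order(Order other) {
        this.catalog = other.catalog;
        this.items = new ArrayList<>(other.items);
    }

    Catalog catalog() { return catalog; }
    List<String> items() { return items; }
}

class CopyConstructorDemo {
    public static void main(String[] args) {
        Order original = new Order(new Catalog("spring"), Arrays.asList("pen"));
        Order copy = new Order(original);
        copy.items().add("ink");

        System.out.println(original.items());                      // [pen]
        System.out.println(copy.items());                          // [pen, ink]
        System.out.println(original.catalog() == copy.catalog());  // true
    }
}
```

A serialization round trip would instead produce a second `Catalog`, because everything reachable from the object gets copied.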

Note that the Cloneable interface is now widely regarded as broken. See this article with Joshua Bloch for more information. In particular, it doesn't declare a clone() method!

Brian Agnew
True, but I was mostly referring to third-party classes (where you are not able to add a constructor.)
finnw
+4  A: 

Brian's point about Cloneable is very good, but even if Cloneable worked right, there are still instances where you might want an object to be serializable but not cloneable.

If an object has a unique identity outside the scope of the process, like an in-memory representation of a database record, you don't want it to be cloneable because that is equivalent to making a new record with identical attributes, including identity-related attributes like the database key, which is almost never the right thing. At the same time, you may have a system broken into multiple processes for stability or other reasons, so you may have one process talking to the database and generating these "ENTITY" objects (See "Domain-Driven Design" by Eric Evans for more info on maintaining object identity coherence in a data-backed application), but a separate process may use these objects to perform business logic operations. The entity object would need to be serializable for it to be passed from one process to another.

David Gladfelter
This is very true. In fact my answer was meant to address the above, but obviously didn't do it clearly enough! +1 plus I've edited my answer.
Brian Agnew
I probably wouldn't make that class serializable either. I might declare an instance method e.g. `readValuesFromStream(ObjectInput is)` but that's not quite the same because you are not creating a new instance.
finnw
+1  A: 

Well, you're saying the serialization mechanism is one way to "clone" objects indirectly. That is of course not its primary function. It's usually used to let programs transmit objects across a network, or store and later read them. You may expect an object to be used this way, and implement Serializable, while not expecting code to clone objects locally, and not implement Cloneable.

The fact that code is working around this via serialization suggests the code is using an object in a way the author didn't intend, which could be either the author's or the caller's "fault", but it doesn't imply that in general Serializable and Cloneable go together.

Separately, I am not sure clone() is "broken" as much as tricky to implement correctly. A copy constructor is more natural to use and get right IMHO.
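One illustration of why clone() is easy to get wrong: Object.clone() copies fields shallowly, so a class with a mutable field has to remember to copy that field itself. This is only a sketch, and the `Playlist` class is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class with one mutable field.
class Playlist implements Cloneable {
    List<String> tracks = new ArrayList<>();

    @Override
    public Playlist clone() {
        try {
            // super.clone() is shallow: at this point copy.tracks == this.tracks
            Playlist copy = (Playlist) super.clone();
            // Leave this line out and both objects silently share one list.
            copy.tracks = new ArrayList<>(tracks);
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen, because this class implements Cloneable
            throw new AssertionError(e);
        }
    }
}
```

Note that the field cannot be final here, since it has to be reassigned after super.clone(); a copy constructor does not have that restriction.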

Sean Owen
+2  A: 

I think the Serializable and Cloneable interfaces serve completely different purposes. And if you have a complex class, then implementing either of them is not so easy. So in the general case, it depends on what you need the class for.

Roman
A: 

That strikes me as somewhat dangerous, as there are a number of pitfalls to serializing (although most of them are unlikely, I'd still want to test for them if I were serializing objects from third-party libraries). Not likely to be a common issue, but it's possible to have an object with a transient variable as part of its state that might be part of a cloning operation (not that this is a good design, just that it's possible), and such a field would not get copied in a serialization/deserialization process. Another issue that comes to mind is singleton-style constants (true Java enum types are handled by serialization, but hand-written typesafe enums and other singletons are not), where you can get multiple copies if the class doesn't deal with them during deserialization, e.g. with readResolve().

Again, edge cases, but something you'd want to watch out for.
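To make the lost-state case concrete, here is a small sketch. The `Session` class and its field names are hypothetical, and `roundTrip` is a reduced version of the helper from the question. A field declared transient is skipped by default serialization, so it comes back as null in the copy:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical third-party-style class with one transient field.
class Session implements Serializable {
    private static final long serialVersionUID = 1L;
    final String user;
    transient String cachedToken;  // skipped by default serialization

    Session(String user, String cachedToken) {
        this.user = user;
        this.cachedToken = cachedToken;
    }
}

class TransientPitfallDemo {
    // Serialize to a byte array and read the object back.
    static <T extends Serializable> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
                oos.writeObject(obj);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(baos.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) ois.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Session copy = roundTrip(new Session("alice", "secret"));
        System.out.println(copy.user);         // alice
        System.out.println(copy.cachedToken);  // null
    }
}
```

Whether that matters depends on the class: some transient fields are caches that get rebuilt lazily, others are state the copy will miss.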

Steve B.
A: 

One of the biggest problems with Serializable classes is that they can't easily be made immutable. Serialization forces you to make a compromise here.

I'd be tempted to create copy constructors down the object graph. It's just a bit more grunt work to do.

Fortyrunner