views:

202

answers:

3

What's the best way to perform a deep copy in the constructor of an object that is part of multi-threaded C++ code?

A: 

My first impulse (I'm not an expert):

Have a lock on that object that the code uses for writing to it. When you're deep copying, lock the object, perform your deep copy, and then unlock it.

Or, am I missing something here?

Claudiu
Not as easy as it sounds, given that it's a deep copy that's involved here. There are potentially many locks on different pieces of data in the object graph. Very, very easy to either miss one or to deadlock.
Andrew
+4  A: 

Depends on your data structures.

I'd guess the problem you're facing (although you don't say so) is potential locking inversion. If you're deep copying, then you will presumably take one or more locks for the various objects you need to copy.

If you can define a DAG (that is, a partial order) where the nodes are every lock in the system, and every combination of locks you might ever want to take is connected by edges, then you can ensure that locks are never taken in different orders in different threads. Hence in particular you won't get locking inversion. A typical rule is to take the "most general" lock last, since this tends to minimise contention.

But if you are deep copying one of a whole bunch of "WidgetBoxes", each containing "Widgets" which are basically indistinguishable, with possible overlap between the contents of boxes, then you have a problem defining a locking order naturally. Further, you'd have to lock the WidgetBox first (even though it's the "most general" object), since without that lock you can't tell what else you need to lock. If Widgets are comparable, you might be able to lock each in order, do your copy, and release everything. Nasty.

An alternative is to define a single lock shared between all Widgets and WidgetBoxes. This might introduce a lot of contention, in which case optimistic locking may improve things, provided that not too many modifications occur concurrently with copies.

Another alternative may be to relax the guarantees you make about the copy - instead of requiring that the full copy be made from an identifiable state of the deep structure, you could first lock the WidgetBox, shallow copy it (using refcounting or whatever - the lock on a refcount is typically an "ultimate inner lock" and hence not an inversion risk), release the WidgetBox lock, then copy each of the Widgets in turn. Use the same approach to copying the Widgets if they have internal structure. The result might contain a Widget in a state it did not achieve until after it was removed from the WidgetBox in another thread, or other such incongruities, so if that's not acceptable then you can't use this approach. But if you only ever lock one object at a time in each thread, then you can't get locking inversion.

A final, possible "nuclear" option, is to make everything immutable, and always copy on modification. If nothing can ever be modified, then you don't need any locks (although you do still need memory barriers when passing references between threads).

If none of these works, then you're out of my experience unless I've forgotten something. I'd speculate that database implementation is somewhere that a lot of lock-related cleverness goes on, and that would be the area to turn for ideas.

Steve Jessop
A: 

fork()

I'm just kidding. But it was too funny to pass up :)

I think onebyone has covered most of your options. Except, perhaps, stepping back and seeing if you can find a way to avoid doing the deep copy in the first place...

Mike G.