Safely moving a C++ object

+3 Q:

I’ve heard some words of warning against shipping an object to another memory location via memcpy, but I don’t know the specific reasons. Unless its contained members do tricky things that depend on memory location, this should be perfectly safe … or not?

EDIT: The contemplated use case is a data structure like a vector, which stores objects (not pointers to objects) in a contiguous chunk of memory (i.e. an array). To insert a new object at the n-th position, all objects from position n onward need to be moved to make room for the object to be inserted.

+1 A:

Off the top of my head : If you just do a memcpy you end up doing a shallow copy. If you need a deep-copy then this won't work.

What's wrong with the copy constructor and the assignment operators anyway?
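To make the shallow-copy point concrete, here is a minimal sketch. `Buffer` and `memcpy_copy_is_shallow` are hypothetical names invented for this illustration; the struct is deliberately kept trivially copyable so that the `memcpy` itself is well-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical handle type: it only points at a buffer, it does not own it.
struct Buffer {
    int* data;
    std::size_t size;
};

// Returns true if a memcpy'd copy shares the original's storage.
bool memcpy_copy_is_shallow() {
    int storage[3] = {1, 2, 3};
    Buffer original = {storage, 3};
    Buffer copy;
    // Copies the pointer value, not the ints it points at.
    std::memcpy(&copy, &original, sizeof(Buffer));
    copy.data[0] = 42;  // writes through the shared pointer
    assert(copy.size == original.size);
    return copy.data == original.data && original.data[0] == 42;
}
```

If `Buffer` owned its memory and freed it in a destructor, both copies would try to free the same block, which is why a deep copy needs the copy constructor.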

Glen 2009-09-11 16:04:51

+6 A:

One primary reason why you should not do this is destructors. When you memcpy a C++ object to another location in memory, you end up with two versions of the object in memory for which only one constructor has been run. This breaks the resource-freeing logic of pretty much every single C++ class out there.

JaredPar 2009-09-11 16:05:11

Yes, that makes sense. However, I'm not copying but *moving* the object, invalidating the previous location (i.e. no longer referring to it with any pointer) so the destructor will not be called on the old memory location.

Jen 2009-09-11 16:08:52

@Jen unless you do something truly evil to the previous place in memory for the old object the destructor will run.

JaredPar 2009-09-11 16:09:25

How? The only way a destructor will be called is via a pointer. If no pointers refer to the memory location, it's simply a bunch of bits. Or am I missing something?

Jen 2009-09-11 16:12:59

Objects with automatic storage duration have their destructors called automatically when they go out of scope.

jalf 2009-09-11 16:14:11

*"unless you do something truly evil [] the destructor will run"* If it is an automatic variable or created by another object, sure, but not if it was explicitly (`O *p = new O;`) allocated. The problem then is that Jen is back to doing programmer mediated object management, with all the risks that that entails.

dmckee 2009-09-11 16:20:20

@dmckee, in your example they are still doing something evil because they are not deleting the memory. The only way to delete the memory and not run the destructor is to cast the pointer to void* before deleting. This is evil :)

JaredPar 2009-09-11 16:23:10

Well I think you could just call `operator delete(p);`

GMan 2009-09-11 16:25:28

@GMan, I think that would work as well. Still evil ;)

JaredPar 2009-09-11 16:27:22

Indeed, it's basically the same thing. Just bypasses calling `void*`'s non-existent destructor and makes the cast implicit. But we agree: **evil**.

GMan 2009-09-11 16:30:58

Oh, by all means: **evil**. But evil for multiple reasons not all associated with an unbalanced number of con-/de-structor calls

dmckee 2009-09-11 16:37:27

@Jen, you say "The only way a destructor will be called is via a pointer. If no pointers refer to the memory location, it's simply a bunch of bits". But now you have a memory leak (and if your original object is on the stack you cannot prevent its destructor from being called). There are so many things that can go wrong with memcpy()-ing objects the way you like that it is not worth it, whatever the reason. Don't go against the language.

sbk 2009-09-11 17:08:51

Normally, I see the above in alternate implementations of vector. The memory is allocated via malloc(sizeof(Class)*size) and the objects are constructed in place via explicitly called constructors and destructors. Sometimes (like during reallocation) they have to be moved, so the options are to do what std::vector does, repeatedly calling copy constructors on the new memory and destructors on the old, or to use memcpy and just "free" the old block. Most times the latter just "works", but it doesn't for all objects.

Todd Gardner 2009-09-11 17:15:30

@Todd: It will work for PODs. It will invoke undefined behavior for non-PODs.

sbi 2009-09-11 17:56:43

+2 A:

If the object had no pointers within it, and no virtual functions, no children with any of the same, you might get away with it. It is not recommended!!!

This should be done using a copy or deepcopy function or overridden operators.

In the method you would call a new constructor and copy its contained data items one by one.

For a shallow copy you would copy pointers / references, so you would have two objects pointing to the same contained elements... a potential memory leak nightmare.

For a deep copy you would traverse the contained objects and references, making new copies of them as well.

To move an object you would copy it and delete the original.

Tony Lambert 2009-09-11 16:07:07

+2 A:

It's not allowed by the language specification. It is undefined behavior. That is, ultimately, what's wrong with it. In practice, it tends to mess with virtual function calls, it means the destructor will be run twice (and more often than the constructors), and member objects are shallow copied (so if, for example, you try this stunt with a std::vector, it blows up, as multiple objects end up pointing to the same internal array).

The exception is POD types. They don't have user-defined (copy) constructors or destructors, virtual functions, base classes or anything else that might cause this to break, so with those, you're allowed to use memcpy to copy them.
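A minimal sketch of the POD case, applied to the question's "make room for an insert" scenario. `Point` and `shift_right` are hypothetical names. Two assumptions to note: the check uses `std::is_trivially_copyable`, a C++11 trait that postdates this thread, and the shift uses `std::memmove` rather than `std::memcpy` because the source and destination ranges overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical POD element type.
struct Point {
    int x;
    int y;
};

// Refuse to compile if someone later makes Point non-trivial.
static_assert(std::is_trivially_copyable<Point>::value,
              "byte-wise shifting is only valid for trivially copyable types");

// Shifts elements [pos, count) one slot to the right to open a gap at pos.
// The caller guarantees that items has room for count + 1 elements.
void shift_right(Point* items, std::size_t count, std::size_t pos) {
    assert(pos <= count);
    // memmove, not memcpy: the two ranges overlap.
    std::memmove(items + pos + 1, items + pos, (count - pos) * sizeof(Point));
}
```

After the call, slot `pos` still holds the old bytes and can simply be overwritten with the new element.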

jalf 2009-09-11 16:13:23

yup, fixed it. Thanks

jalf 2009-09-11 16:28:38

+2 A:

Short answer: std::memcpy() is for moving memory, not for moving objects. Using it on non-POD objects nonetheless will invoke undefined behavior.

Somewhat longer answer: A C++ object that isn't a POD might contain resources that need to be freed and which are kept in handles that cannot be easily copied. (A popular resource is memory, where the handle is a pointer.) It also might contain stuff inserted by the implementation (virtual base class instance pointers) that shouldn't be copied as if it were memory.

The only right way to move an object in C++98 and C++03 is to copy-construct it at its new location and invoke the destructor at the old one. (In C++1x there will be move semantics, so things might get more interesting in certain cases.)
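The copy-construct-then-destroy approach might look roughly like this. It is a sketch only: `relocate`, `destroy` and `relocate_demo` are hypothetical helpers, and the raw storage is a plain aligned byte array (using C++11 `alignas` for brevity) rather than anything a real container would use.

```cpp
#include <cassert>
#include <new>
#include <string>

// Runs the destructor without releasing the underlying storage.
template <typename T>
void destroy(T* p) {
    p->~T();
}

// Copy-constructs *src into raw storage at dst, then destroys the original.
// If the copy constructor throws, *src is left untouched.
template <typename T>
T* relocate(T* src, void* dst) {
    T* moved = new (dst) T(*src);  // copy constructor runs at the new location
    destroy(src);                  // destructor runs at the old location
    return moved;
}

std::string relocate_demo() {
    alignas(std::string) unsigned char old_slot[sizeof(std::string)];
    alignas(std::string) unsigned char new_slot[sizeof(std::string)];

    std::string* s = new (old_slot) std::string("long enough to need a heap allocation");
    std::string* t = relocate(s, new_slot);  // s must not be used after this

    std::string result = *t;
    destroy(t);  // exactly one destructor call per constructed object
    return result;
}
```

Every constructor call is matched by exactly one destructor call, which is the balance that a raw memcpy cannot guarantee.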

sbi 2009-09-11 16:23:07

jmucchiello 2009-09-11 17:41:29

@jmucchiello: that's why I qualified my statement with "...in certain cases". (BTW, I don't think your example is right. A temporary needs to be created in both cases.)

sbi 2009-09-11 17:54:15

+3 A:

For the sake of discussion, I assume that by "moving" you mean that the original object is "dropped" (it is no longer used and never has its destructor run) rather than ending up with two copies (which would lead to a lot more problems, reference counts being off, etc). I generally call the property of being able to do this "bitwise movable".

In the code bases I work on, the majority of objects are bitwise movable, as they don't store self references. However, some data structures aren't bitwise movable (I believe that gcc's std::set wasn't bitwise movable; other examples would be linked list nodes). In general, I would avoid relying on this property, as it can lead to some very hard to debug errors, and prefer the object-oriented approach of calling copy constructors.
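A small sketch of why self references break bitwise moves. `SelfRef` and `self_pointer_survives_memcpy` are hypothetical names; the struct is kept trivially copyable on purpose so that the memcpy is legal and only the logical invariant is violated.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical type with an internal pointer to one of its own members.
struct SelfRef {
    int value;
    int* ptr;  // invariant: always points at this object's own `value`
};

// Returns true if the invariant still holds for the memcpy'd object.
bool self_pointer_survives_memcpy() {
    SelfRef a;
    a.value = 7;
    a.ptr = &a.value;
    assert(a.ptr == &a.value);  // invariant holds for the original

    SelfRef b;
    std::memcpy(&b, &a, sizeof(SelfRef));

    // b.ptr was copied bit for bit, so it still points into a, not into b.
    return b.ptr == &b.value;
}
```

The function returns false: once `a` goes away, `b.ptr` dangles. A copy constructor could repair the pointer; a byte copy cannot.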

Edited to add:

There seems to be some confusion on how/why someone would do this: here's a comment I made on the how:

Normally, I see the above in alternate implementations of vector. The memory is allocated via malloc(sizeof(Class)*size) and the objects are constructed in place via explicitly called constructors and destructors. Sometimes (like during reallocation) they have to be moved, so the options are to do what std::vector does, repeatedly calling copy constructors on the new memory and destructors on the old, or to use memcpy and just "free" the old block. Most times the latter just "works", but it doesn't for all objects.

As to why, a memcpy (or realloc) approach can be significantly faster.

Yes, it invokes undefined behavior, but it also just tends to work for a majority of objects. Some people consider the speed worth it. If you were really set on using this approach, I would suggest implementing a bitwise_movable type trait to allow types this works for to be whitelisted, and fall back on the traditional copy for objects not in the whitelist, much like the example here.
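One possible shape for such a trait, as a sketch only. Everything here (`bitwise_movable`, `relocate_range`, the tag-dispatch layout) is hypothetical and not taken from any library. By default it whitelists only trivially copyable types (via the C++11 `std::is_trivially_copyable`); specialising it to true for another type is an explicit opt-in to behaviour the standard does not guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <string>
#include <type_traits>

// Hypothetical opt-in trait. Specialise to std::true_type at your own risk.
template <typename T>
struct bitwise_movable : std::is_trivially_copyable<T> {};

// Fast path: whitelisted types are relocated as raw bytes.
template <typename T>
void relocate_range(T* src, T* dst, std::size_t n, std::true_type) {
    std::memcpy(static_cast<void*>(dst), static_cast<const void*>(src),
                n * sizeof(T));
}

// Fallback: copy-construct at the new location, destroy at the old one.
// Not exception-safe; a real container would roll back on a throwing copy.
template <typename T>
void relocate_range(T* src, T* dst, std::size_t n, std::false_type) {
    for (std::size_t i = 0; i < n; ++i) {
        new (static_cast<void*>(dst + i)) T(src[i]);
        src[i].~T();
    }
}

// Entry point: dst must be raw, suitably aligned storage that does not
// overlap src. After the call, the objects in src no longer exist.
template <typename T>
void relocate_range(T* src, T* dst, std::size_t n) {
    assert(src != dst);
    relocate_range(src, dst, n,
                   std::integral_constant<bool, bitwise_movable<T>::value>());
}
```

With this layout, `relocate_range` on an array of `int` takes the memcpy path, while `std::string` elements go through the copy-and-destroy fallback unless someone deliberately specialises the trait for them.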

Todd Gardner 2009-09-11 16:25:46

In general (and in all languages, not just C++), in order to safely move an object, you also need to rewrite ALL pointers/references to that object to point at the new location. That's a problem in C++, because there's no easy way to tell if any object in the system has a 'hidden' pointer to the object you're moving. As you've noted, some classes may contain hidden pointers to themselves. Other classes may have hidden pointers in a factory object that tracks all instances. It's also possible for seemingly unrelated classes to cache pointers to objects for various reasons of their own.

The only way to do it safely is if you have some sort of reflective access to all objects in the system so that you can find all the pointers to the object and rewrite them. This is a potentially very expensive operation in any case, so systems that need it (such as copying garbage collectors) tend to be very carefully organized to do the copying of many objects at once and/or bound the places that need to be searched for pointers with write barriers and such.

Chris Dodd 2009-09-11 17:26:11
