views:

171

answers:

6

Hello!

What's better from a performance point of view std::map<uint32_t, MyObject> or std::map<uint32_t. MyObject*> if MyObject is 'fat' (that is operator= rather expensive) and I have to insert/update/delete a lot ?

A: 

Inserting a object leads to a call of the assignment operator (=). So the performance depends on the implementation of MyObject's assignment operator. If the implementations has to copy large amounts of data, that may lead to a performance bottleneck. If the implementation used e.g. a kind of copy-on-write scheme it should not.

More often I suppose the object style is not as performance at the pointer style. However, if it leads to an bottleneck depends on the actual application.

The advantage of using the object instead of the pointer is that is makes the memory management much easier.

dmeister
No, inserting an object leads to a call to the copy constructor, not to the assignment operator. This does not make much difference though, since copy constructor is also going to be heavy.
AndreyT
@AndrewT: You are right!
dmeister
A: 

If copying MyObjects is an expensive operation then there will be a performance penalty. However you should probably benchmark this yourself and see if it's even an issue in the context of all the other things your app is doing.

If you have boost you should consider storing a smart pointer (possibly boost::shared_pointer) in your map instead of bare pointers as this is much safer

Glen
+5  A: 

If you'd prefer to store the objects "by value", but don't want to perform expensive copying, then just don't do the copying at all. For example, you can always insert "empty" objects (which can be copied quickly) and then fill them with actual content after they are already inserted into the map. The latter can be done in more efficient way by employing, for example, move semantics instead of copy semantics. Associative containers are not supposed to perform any copying between the already inserted elements (although in theory it is probably possible), so once you have taken care of the new element insertion, you should not run into any additional issues with expensive copying.

For example, a typical "expensive" insertion scenario might look as follows

MyObject new_value(/* constructor arguments */);
// Maybe do some additional preparations on `new_value`
// ...

// And now: the actual insertion
map[key] = new_value;
// .. which makes a call to the heavy assignment operator

Note, that in this scenario it is you who's making the call to the assignment operator. Since you have the control over the actual copying, you can rewrite it in much less expensive fashion, as follows

MyObject& new_value = map[key];
// Now `new_value` is a reference to a default-constructed object

// Here you should "load" the `new_value` object with whatever information
// you want it to carry. That should cover both the original constructor's
// functionality from the previous piece of code, as well as any 
// post-constructor preparations
// ...

Note, that in the second scenario the effort required to build the new value is basically the same as in the first one, but there no extra copying for the actual insertion. Also note, that in this case your object has to be default-consructible, which is not normally a requirement imposed on standard container elements.

If you decide to store the objects "by pointer", a better idea would be to use appropriate smart pointers instead of raw pointers.

AndreyT
@AndreyT: why associative containers are not supposed to perform any copying between the already inserted elements ? Can you please elaborate on this ?
Konstantin
@Konstantin: The common specification of associative containers states that insertion of new element does not invalidate any existing iterators pointing into the container. Also, deletion of elements is not allowed to invalidate any iterators besides the ones being deleted. This usually means that associative containers do not "touch" (reallocate, copy, etc) any elements besides the ones immediately involved in the current operation.
AndreyT
@Konstantin: Of course, the requirement can be satisfied with copying still involved (copy aside, then copy back in place), but normally implementations of associative containers don't do and don't need to do any copying internally (aside from new element insertion).
AndreyT
I don't understand what you are saying. The assignment will be near enough as expensive as the copy. You are basically arguing for 2 phase construction of the object inside the map, but so what. The cost of ClassA c; c= c_orig; should be roughly equivalent to ClassA c(c_orig); What have I missed?
polyglot
@polyglot: Apprently you misundertood something. What I'm proposing does not involve neither copy-assignemt nor copy-cosntruction. What I'm proposing is to default-construct an object in the map (first phase), and then use the second phase to directly "build" the requred object inside the map. I illustrated it by the example above. No copying involved.
AndreyT
+1  A: 

std::map<uint32_t, boost::shared_ptr<MyObject> > is the nice way to deal with this.

If copy is expensive, it will generally be undesirable to store the map values by value.

polyglot
other than boost?
sevity
you can always store the pointers if you want to stay away from boost. You cannot ever use auto_ptr<T> in a map or vector however.
polyglot
+4  A: 

Check out the Boost Pointer Container Library for containers that hold pointers safely.

Mark Ransom
A: 

Inserting elements will actual invoke the copy constructor, not operator=, and you shouldn't automatically assume that the overhead of dynamically allocating and accessing the object through a pointer will be faster than paying the price of an extra construction. Especially with C++0x, rvalue references can make this cost evaporate, and storing containers of pointers as a performance optimization will disappear.

Point being, don't assume that storing a fat object in a container is bad, especially std::map (which should only make one copy). That extra level of indirection can really hurt on CPU bound operations. You should measure; but the "line" where it's better to store a pointer is often much higher than people think.

It also depends on what you mean by expensive. Modern processors are really good at moving memory around. But if by expensive you mean talking over a network or reading from disk, then yeah that extra copy might be painful (..again, until rvalue references come around).

Terry Mahaffey