views:

584

answers:

6

When a C++ function accepts an std::vector argument, the usual pattern is to pass it by const reference, such as:

int sum2(const std::vector<int> &v)
{
   int s = 0;
   for(size_t i = 0; i < v.size(); i++) s += fn(v[i]);
   return s;
}

I believe that this code results in double dereferencing when the vector elements are accessed, because the CPU should first dereference v to read the pointer to the first element, which pointer needs to be dereferenced again to read the first element. I would expect that it would be more efficient to pass a shallow copy of the vector object on the stack. Such shallow copy would encapsulate a pointer to the first element, and the size, with the pointer referencing the same memory area as the original vector does.

int sum2(vector_ref<int> v)
{
   int s = 0;
   for(size_t i = 0; i < v.size(); i++) s += fn(v[i]);
   return s;
}

Similar performance, but much less convenience could be achieved by passing a random access iterator pair. My question is: what is flawed with this idea? I expect that there should be some good reason that smart people accept to pay the performace cost of vector reference, or deal with the inconvenience of iterators.

Edit: Based on the coments below, please consider the situation if I simply rename the suggested vector_ref class to slice or range. The intention is to use random-access iterator pairs with more natural syntax.

+1  A: 

Pass by value unless you're certain passing by reference improves performances.

When you pass by value, copy elision may occur which will result in similar if not better performances.

Dave wrote about it here:

http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/

Edouard A.
Sebastian
If it doesn't alter the vector, passing by value is perfectly fine too. +1 from me.
jalf
Interesting. I'll have to read that more closely. Despite the title, though, it looks like he's focus on returns, not parameter passing, so I'm not sure it applies to a situation like this.
Adrian McCarthy
Checked the article. As far as I understand copy elision only applies when the compiler can be sure that you don't need the original object anymore. This is not the case here. The caller of sum2 could just as well continue to use the original vector for other calculations, so it can not be "stealed" into this function.
shojtsy
It appears it recommends passing by value if you actually want to make a copy, rather than passing a reference and making a copy from that.
UncleBens
+7  A: 

I believe that this code results in double dereferencing when the vector elements are accessed

Not necessarily. Compilers are pretty smart and should be able to eliminate common subexpressions. They can see that the operator [] doesn't change the 'pointer to the first element', so they have no need make the CPU reload it from memory for every loop iteration.

int3
Excellent point.
Adrian McCarthy
OK, so compilers can theoretically decrease the number of double derefences to 1 for each call of this method. Wouldn't an ensured 0 be better?
shojtsy
Calculate your costs in terms of memory accesses, not the number of dereferences. In your proposed 'shallow copy' you'd be doing the memory access in the caller instead of the callee, so the net result would be (at best) the same. In fact, if the compiler doesn't inline the function, it'd probably be worse, since you're passing it a larger argument.
int3
+5  A: 

What is flawed with this idea?

Simple: It's premature optimization. Alternatives: Accept a vector<int> const& and use iterators or pass iterators directly to the function.

sellibitze
Is it premature optimization when the code is simpler then the naive version? Why use iterators when we can create a more natural syntax with the same performace as iterators?
shojtsy
Because you could have just passed the original iterator, by value or by const reference, and trusted the compiler to optimize your code. In many cases, the compiler is able to eliminate the double indirection. And in the cases where it can't do that, the performance difference almost certainly won't actually *matter*.
jalf
A pair of iterators *is* already the "shallow copy you want". Also, it's premature optimization because you're likely looking in the wrong place to improve performance. Measure first, then take actions.
sellibitze
@shojtsy: You could check whether there's any performance difference between `v[i]` and `int* p = ... p[i]` and report back. :)
sellibitze
+2  A: 

There is no double dereferencing because the compiler will probably pass the real pointer to the vector as the argument and not a pointer to a pointer. You can simply try this out and check the disassembly view of your IDE for what is actually going on behind the scenes:

void Method(std::vector<int> const& vec) {
 int i = vec.back();
}


void SomeOtherMethod() {
  std::vector<int> vec;
  vec.push_back(1);
  Method(vec);
}

What happens here? The vector is allocated on the stack. The first push back is translated to:

push        eax  // this is the constant one that has been stored in eax
lea         ecx,[ebp-24h] // ecx is the pointer to vec on the stack
call        std::vector<int,std::allocator<int> >::push_back

Now we call Method(), passing the vector const&:

lea         ecx,[ebp-24h] 
push        ecx  
call        Method (8274DC0h)

Unsurprisingly, the pointer to the vector is passed as references are nothing but permanently dereferenced pointers. Now inside Method(), the vector is accessed again:

mov         ecx,dword ptr [ebp+8] 
call        std::vector<int,std::allocator<int> >::back (8276100h)

The vector pointer is taken directly from the stack and written to ecx.

Sebastian
To do that, the compiler would have to change the function signature, which seems unlikely. Changing the signature would break callers of the functions. It's possible if you have advanced link-time code generation, but I'm not sure if there's an implementation out there that's that advanced.
Adrian McCarthy
I checked it out. No double dereferencing.
Sebastian
The vector contains a pointer to the actual data, so passing a pointer to the vector is indeed passing a pointer to the pointer. The second dereference happens in the call to back(). If you passed an iterator instead, then (in most implementations you'd be passing a pointer directly to the data contained within the vector, thus eliminating a level of indirection.
Adrian McCarthy
Now I finally understand what you meant by double dereferencing. I didn't think the poster meant that because this kind of double dereferencing always occurs even when accessing the vector on the stack.
Sebastian
Sebastian, in your last code section, the first dereference is the mov, and the second is inside the back method when dereferencing the pointer stored as a field of the vector object, isn't it?
shojtsy
@shojtsy: You're absolutely right. The original poster asked how to pass the vector as a method argument most efficiently, thus I thought he implied that by passing it as a reference, he would introduce an additional dereferencing.
Sebastian
@Sebastian: I am the original poster. I was trying to describe that passing a reference to a vector introduces an extra dereferencing as compared to passing a random iterator pair by value (which is same as a pointer pair). The random iterator pair could then be encapsulated in a class having vector-like syntax without introducing a dereferencing level. I already start from the point that the idea is flawed, otherwise it would be widely used, I am just trying to find where the flaw is.
shojtsy
+1  A: 

You're right that there's an extra indirection here. It's conceivable (though it would be surprising) if the compiler (with the help of link-time code generation) optimized it away.

What you've proposed is sometimes called slicing, and it's used extensively in some situations. Though, in general, I'm not sure it's worth the dangers. You have to be very careful about invaliding your slice (or someone else's).

Note that if you used iterators for the loop instead of indexing, then you'd deref the reference only a couple times (to call begin() and end()) rather than n times (to index into the vector).

int sum(const vector<int> &v)
{
   int s = 0;
   for (auto it = v.begin(); it != v.end(); ++it) {
       s += fn(*it);
   }
   return s;
}

(I'm assuming the optimizer will hoist the begin() and end() calls out of the loop. You could do it explicitly to be certain.)

Passing a pair of iterators instead of the container itself seems like the STL idiom. That would give you more generality, as the type of container can vary, but so can the number of dereferences needed.

Adrian McCarthy
+5  A: 

What's wrong with your idea is that you already have two perfectly good solutions:

  • Pass the vector as is, either by value (where the compiler will often eliminate the copy), or by (const) reference, and trust the compiler to eliminate the double indirection, or
  • Pass an iterator pair.

Of course you can argue that the iterator pair is "less natural syntax", but I disagree. It is perfectly natural to anyone who's used to the STL. It is efficient, and gives you exactly what you need to work with the range, using std algorithms or your own functions.

Iterator pairs are a common C++ idiom, and a C++ programmer reading your code will understand them without a problem, whereas they're going to be surprised at your home-brewed vector wrappers.

If you're really paranoid about performance, pass the pair of iterators. If the syntax really bothers you, pass the vector and trust the compiler.

jalf