views:

774

answers:

11

Hey, I'm trying to optimize my C++ code. I've searched the internet on using dynamically allocated C++ arrays vs using std::vector and have generally seen a recommendation in favor of std::vector and that the difference in performance between the two is negligible. For instance here - http://stackoverflow.com/questions/381621/using-arrays-or-stdvectors-in-c-whats-the-performance-gap.

However, I wrote some code to test the performance of iterating through an array/vector and assigning values to the elements and I generally found that using dynamically allocated arrays was nearly 3 times faster than using vectors (I did specify a size for the vectors beforehand). I used g++-4.3.2.

However I feel that my test may have ignored issues I don't know about so I would appreciate any advice on this issue.

Thanks

Code used -

#include <time.h>
#include <iostream>
#include <vector>

using namespace std;

int main() {
  clock_t start,end;
  std::vector<int> vec(9999999);
  std::vector<int>::iterator vecIt = vec.begin();
  std::vector<int>::iterator vecEnd = vec.end();

  start = clock();
  for (int i = 0; vecIt != vecEnd; i++) {
    *(vecIt++) = i;
  }
  end = clock();
  cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;

  int* arr = new int[9999999];
  start = clock();
  for (int i = 0; i < 9999999; i++) {
    arr[i] = i;
  }
  end = clock();
  cout<<"array: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;
}
+2  A: 

I imagine the reason why you found iterating and adding to std::vector 3 times slower than a plain array is a combination of the cost of iterating the vector and doing the assigment.

Edit:

That was my initial assumption before the testcase; however running the testcase (compiled with -O3) shows the converse - std::vector is actually 3 times faster, which surprised me.

I can't see how std::vector could be faster (certainly not 3 times faster) than a vanilla array copy - I think there's some optimisation being applied to the std::vector compiled code which isn't happening for the array version.

Original benchmark results:

$ ./array
array:  0.059375
vector: 0.021209
  • std::vector is 3x faster. Same benchmark again, except add an additional outer loop to run the test iterater loop 1000 times:

    $ ./array array: 21.7129 vector: 21.6413

  • std::vector is now ~ the same speed as array.

Edit 2

Found it! So the problem with your test case is that in the vector case the memory holding the data appears to be already in the CPU cache - either by the way it is initialised, or due to the call to vec.end(). If I 'warm' up the CPU cache before each timing test, I get the same numbers for array and vector:

#include <time.h>
#include <iostream>
#include <vector>

int main() {
  clock_t start,end;
  std::vector<int> vec(9999999);
  std::vector<int>::iterator vecIt = vec.begin();
  std::vector<int>::iterator vecEnd = vec.end();

  // get vec into CPU cache.
  for (int i = 0; vecIt != vecEnd; i++) { *(vecIt++) = i; }
  vecIt = vec.begin();
  start = clock();
  for (int i = 0; vecIt != vecEnd; i++) {
    *(vecIt++) = i;
  }
  end = clock();
  std::cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<std::endl;

  int* arr = new int[9999999];

  // get arr into CPU cache.
  for (int i = 0; i < 9999999; i++) { arr[i] = i; }
  start = clock();
  for (int i = 0; i < 9999999; i++) {
    arr[i] = i;
  }
  end = clock();
  std::cout<<"array: "<<(double)(end-start)/CLOCKS_PER_SEC<<std::endl;
}

This gives me the following result:

$ ./array
vector: 0.020875
array: 0.020695
Dave Rigby
it is also not safe. It doesn't do bounds checking by default.
jalf
btw, why all the downvotes?
Dave Rigby
Maybe it's caching the array addresses in the CPU register which is why further runs are much faster? I tried looping 2 times instead of 1000 and while the first run took 0.06s, the second run took 0.25s
Sid
Of course that doesn't explain why vector should run at optimal speed right from the start.
Sid
@jalf: Didn't realise that - shows much recently I used std::vector in anger! I've updated post to remove the mistake.
Dave Rigby
+19  A: 

When benchmarking C++ comtainers, it's important to enable most compiler optimisations. Several of my own answers on SO have fallen foul of this - for example, the function call overhead when something like operator[] is not inlined can be very significant.

anon
A: 

The issue seems to be that you compiled your code with optimizations turned off. On my machine, OS X 10.5.7 with g++ 4.0.1 I actually see that the vector is faster than primitive arrays by a factor of 2.5.

With gcc try to pass -O2 to the compiler and see if there's any improvement.

wilhelmtell
+5  A: 

Just for fun, try iterating over the plain array using a pointer instead of an integer index (the code should look just like the vector iteration, since the point of STL iterators is to appear like pointer arithmetic for most operations). I bet the speed will be exactly equal in that case. Which of course means you should pick the vector, since it will save you a world of headaches from managing arrays by hand.

rmeador
A: 

The reason that your array iterating is faster is that the the number of iteration is constant, and compiler is able to unroll the loop. Try to use rand to generate a number, and multiple it to be a big number you wanted so that compiler wont be able to figure it out at compile time. Then try it again, you will see similar runtime results.

leiz
+3  A: 

The thing about the standard library classes such as std::vector is that yes, naively, it is a lot more code than a raw array. But all of it can be trivially inlined by the compiler, which means that if optimizations are enabled, it becomes essentially the same code as if you'd used a raw array. The speed difference then is not negligible but non-existent. All the overhead is removed at compile-time.

But that requires compiler optimizations to be enabled.

jalf
+1  A: 

I agree with rmeador,

  for (int i = 0; vecIt != vecEnd; i++) {
    *(vecIt++) = i; // <-- quick offset calculation
  }
  end = clock();
  cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;

  int* arr = new int[9999999];
  start = clock();
  for (int i = 0; i < 9999999; i++) {
    arr[i] = i; // <-- not fair play :) - offset = arr + i*size(int)
  }
Nick D
+1  A: 

I think the answer here is obvious: it doesn't matter. Like jalf said the code will end up being about the same, but even if it wasn't, look at the numbers. The code you posted creates a huge array of 10 MILLION items, yet iterating over the entire array takes only a few hundredths of a second.

Even if your application really is working with that much data, whatever it is you're actually doing with that data is likely to take much more time than iterating over your array. Just use whichever data structure you prefer, and focus your time on the rest of your code.

To prove my point, here's the code with one change: the assignment of i to the array item is replaced with an assignment of sqrt(i). On my machine using -O2, the execution time triples from .02 to .06 seconds.

#include <time.h>
#include <iostream>
#include <vector>
#include <math.h>

using namespace std;

int main() {
  clock_t start,end;
  std::vector<int> vec(9999999);
  std::vector<int>::iterator vecIt = vec.begin();
  std::vector<int>::iterator vecEnd = vec.end();

  start = clock();
  for (int i = 0; vecIt != vecEnd; i++) {
    *(vecIt++) = sqrt(i);
  }
  end = clock();
  cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;

  int* arr = new int[9999999];
  start = clock();
  for (int i = 0; i < 9999999; i++) {
    arr[i] = i;
  }
  end = clock();
  cout<<"array: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;
}
spectre256
A: 

I agree that it is depends on compiler. for instance some compilers will optimize:

  for (int i = 0; vecIt != vecEnd; i++) {
    *(vecIt++) = i;
  }

but for some next line will be faster:

  for (int i = 0; vecIt != vecEnd; ++vecIt) {
    *vecIt = i++;
  }
db_
A: 

One reason you're code might not be performing quite the same is because on your std::vector version, you are incrimenting two values, the integer i and the std::vector::iterator vecIt. To really be equivalent, you could refactor to

start = clock();
for (int i = 0; i < vec.size(); i++) {
  vec[i] = i;
}
end = clock();
cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;
TokenMacGuy
A: 

Your code provides an unfair comparison between the two cases since you're doing far more work in the vector test than the array test.

With the vector, you're incrementing both the iterator (vecIT) and a separate variable (i) for generating the assignment values.

With the array, you're only incrementing the variable i and using it for dual purpose.

csj