ansaurus

Question

Using arrays or std::vectors in C++, what's the performance gap?

Answer 1

+1 A:

STL is a heavily optimized library. In fact, it's even suggested to use STL in games where high performance might be needed. Arrays are too error prone to be used in day to day tasks. Today's compilers are also very smart and can really produce excellent code with STL. If you know what you are doing, STL can usually provide the necessary performance. For example by initializing vectors to required size (if you know from start), you can basically achieve the array performance. However, there might be cases where you still need arrays. When interfacing with low level code (i.e. assembly) or old libraries that require arrays, you might not be able to use vectors.

Mehrdad Afshari 2008-12-19 17:37:29

given that vector is contiguous, it is still pretty easy to interface with libraries that require arrays.

Greg Rogers 2008-12-19 17:46:22

Yes, but if you want to mess with vector's internal stuff, there would be less advantage in using a vector. By the way, the keyword was "might not."

Mehrdad Afshari 2008-12-19 17:48:06

Johannes Schaub - litb 2008-12-19 17:59:32

The specific scenario in my mind when I wrote that was stack-based arrays.

Mehrdad Afshari 2008-12-19 18:01:54

Answer 2

+26 A:

Using C++ arrays with new (that is, using dynamical arrays) should be avoided. There is the problem you have to keep track of the size, and you need to delete them manually, and do all sort of housekeeping.

Using arrays on the stack is also discouraged because you don't have range checking, and passing the array around will loose any information about its size (array to pointer conversion). You should use boost::array in that case, which wraps a C++ array in a small class and provides a size function and iterators to iterate over it.

Now the std::vector vs. native C++ arrays (taken shamelessly from here: http://www.xs4all.nl/~weegen/eelis/vector-speed.cpp):

// Comparison of assembly code generated for basic indexing, dereferencing, 
// and increment operations on vectors and arrays/pointers.

// Assembly code was generated by gcc 4.1.0 invoked with  g++ -O3 -S  on a 
// x86_64-suse-linux machine.

#include <vector>

struct S
{
  int padding;

  std::vector<int> v;
  int * p;
  std::vector<int>::iterator i;
};

int pointer_index (S & s) { return s.p[3]; }
  // movq    32(%rdi), %rax
  // movl    12(%rax), %eax
  // ret

int vector_index (S & s) { return s.v[3]; }
  // movq    8(%rdi), %rax
  // movl    12(%rax), %eax
  // ret

// Conclusion: Indexing a vector is the same damn thing as indexing a pointer.

int pointer_deref (S & s) { return *s.p; }
  // movq    32(%rdi), %rax
  // movl    (%rax), %eax
  // ret

int iterator_deref (S & s) { return *s.i; }
  // movq    40(%rdi), %rax
  // movl    (%rax), %eax
  // ret

// Conclusion: Dereferencing a vector iterator is the same damn thing 
// as dereferencing a pointer.

void pointer_increment (S & s) { ++s.p; }
  // addq    $4, 32(%rdi)
  // ret

void iterator_increment (S & s) { ++s.i; }
  // addq    $4, 40(%rdi)
  // ret

// Conclusion: Incrementing a vector iterator is the same damn thing as 
// incrementing a pointer.

Johannes Schaub - litb 2008-12-19 17:41:03

Mehrdad Afshari 2008-12-19 17:49:06

Note that std::tr1::array (or boost::array) can resolve cases where you would have used native array with new.

Klaim 2008-12-19 19:17:47

This is not true for the Visual C++ compiler. But for GCC it is.

toto 2009-11-10 16:45:49

The point in my answer is that vector doesn't *have* to be slower than correponding pointer operations. Of course, it *can* be (easy to achieve by enabling enable debug mode, too) :)

Johannes Schaub - litb 2009-11-10 16:51:52

Answer 3

A:

If you do not need to dynamically adjust the size, you have the memory overhead of saving the capacity (one pointer/size_t). That's it.

Greg Rogers 2008-12-19 17:41:36

Answer 4

A:

Go with STL. There's no performance penalty. The algorithms are very efficient and they do a good job of handling the kinds of details that most of us would not think about.

John D. Cook 2008-12-19 17:41:51

Answer 5

+2 A:

The performance difference between the two is very much implementation dependent - if you compare a badly implemented std::vector to an optimal array implementation, the array would win, but turn it around and the vector would win...

As long as you compare apples with apples (either both the array and the vector have a fixed number of elements, or both get resized dynamically) I would think that the performance difference is negligible as long as you follow got STL coding practise. Don't forget that using standard C++ containers also allows you to make use of the pre-rolled algorithms that are part of the standard C++ library and most of them are likely to be better performing than the average implementation of the same algorithm you build yourself.

That said, IMHO the vector wins in a debug scenario with a debug STL as most STL implementations with a proper debug mode can at least highlight/cathc the typical mistakes made by people when working with standard containers.

Oh, and don't forget that the array and the vector share the same memory layout so you can use vectors to pass data to legacy C or C++ code that expects basic arrays. Keep in mind that most bets are off in that scenario, though, and you're dealing with raw memory again.

Timo Geusch 2008-12-19 17:45:09

I think that to meet the performance requirements ( O(1) lookups and insertions ), you almost **have** to implement std::vector<> using dynamic arrays. Certainly this is the obvious way to do it.

dmckee 2008-12-19 20:00:47

Not just the performance requirements, but also the requirement that storage in contiguous. A bad vector implementation will put too many layers of indirection between the array and the API. A good vector implementation will allow for inlined code, SIMD used on loops, etc.

Max Lybbert 2008-12-19 20:40:56

Answer 6

+3 A:

To respond to something Mehrdad said:

However, there might be cases where you still need arrays. When interfacing with low level code (i.e. assembly) or old libraries that require arrays, you might not be able to use vectors.

Not true at all. Vectors degrade nicely into arrays/pointers if you use:

vector<double> vector;
vector.push_back(42);

double *array = &(*vector.begin());

// pass the array to whatever low-level code you have

This works for all major STL implementations. In the next standard, it will be required to work (even though it does just fine today).

Frank Krueger 2008-12-19 17:47:43

Johannes Schaub - litb 2008-12-19 17:54:23

The current standard says no such thing. It is implied, and it is implemented as continuous storage. But the standard merely says that it is a random access container (using iterators). The next standard will be explicit.

Frank Krueger 2008-12-19 18:05:27

Frank Krueger 2008-12-19 18:07:44

Original 1998 text of the Standard indeed did not require it, but there was an addendum in 2003 that addresses this, so it is really covered by the Standard. http://herbsutter.wordpress.com/2008/04/07/cringe-not-vectors-are-guaranteed-to-be-contiguous/

Nemanja Trifunovic 2008-12-19 18:13:42

+1 for Nemanja's comment. Also, it's a mostly academic debate for some time now; all modern stl libs have 'done the right thing' for some time.

MattyT 2008-12-21 14:22:08

Answer 7

A:

There might be some edge case where you have a vector access inside an inline function inside an inline function, where you've gone beyond what the compiler will inline and it will force a function call. That would be so rare as to not be worth worrying about - in general I would agree with litb.

I'm surprised nobody has mentioned this yet - don't worry about performance until it has been proven to be a problem, then benchmark.

Mark Ransom 2008-12-19 18:04:48

Answer 8

A:

Sometimes arrays are indeed better than vectors. If you are always manipulating a fixed length set of objects, arrays are better. Consider the following code snippets:

int main() {
int v[3];
v[0]=1; v[1]=2;v[2]=3;
int sum;
int starttime=time(NULL);
cout << starttime << endl;
for (int i=0;i<50000;i++)
for (int j=0;j<10000;j++) {
X x(v);
sum+=x.first();
}
int endtime=time(NULL);
cout << endtime << endl;
cout << endtime - starttime << endl;

}

where the vector version of X is

class X { vector vec; public: X(const vector& v) {vec = v;} int first() { return vec[0];} };

and the array version of X is:

class X {
int f[3];

public:
X(int a[]) {f[0]=a[0]; f[1]=a[1];f[2]=a[2];}
int first() { return f[0];}
};

The array version will of main() will be faster because we are avoiding the overhead of "new" everytime in the inner loop.

(This code was posted to comp.lang.c++ by me).

duli 2008-12-19 18:07:53

Answer 9

A:

I'd argue that the primary concern isn't performance, but safety. You can make a lot of mistakes with arrays (consider resizing, for example), where a vector would save you a lot of pain.

Gabriel Isenberg 2008-12-19 18:08:16

Answer 10

+11 A:

Vectors are arrays under the hood. The performance is the same.

One place where you can run into a performance issue, is not sizing the vector correctly to begin with.

As a vector fills, it will resize itself, and that can imply, a new array allocation, followed by n copy constructors, followed by about n destructor calls, followed by an array delete.

If your construct/destruct is expensive, you are much better off making the vector the correct size to begin with.

There is a simple way to demonstrate this. Create a simple class that shows when it is constructed/destroyed/copied/assigned. Create a vector of these things, and start pushing them on the back end of the vector. When the vector fills, there will be a cascade of activity as the vector resizes. Then try it again with the vector sized to the expected number of elements. You will see the difference.

EvilTeach 2008-12-19 18:17:54

Pendantry: the performance has the same big O. std::vector does a little bit of bookkeeping, which presumably cost a small amount of time. OTOH, you end up doing much of the same bookkeeping when rolling your own dynamic arrays.

dmckee 2008-12-19 20:04:46

yes i understand. The thrust of his question though, was what are the performance differences..... I attempted to address that.

EvilTeach 2008-12-19 23:42:34

Answer 11

A:

If you compile the software in debug mode, many compilers will not inline the accessor functions of the vector. This will make the stl vector implementation much slower in circumstances where performance is an issue. It will also make the code easier to debug since you can see in the debugger how much memory was allocated.

In optimized mode, I would expect the stl vector to approach the efficiency of an array. This is since many of the vector methods are now inlined.

2008-12-19 20:04:35

Answer 12

+1 A:

Preamble for micro-optimizer people

Remember "Premature optimization is the root of all evil".

Don't use a C array instead of a vector (or whatever) just because you believe it's faster as it is supposed to be lower-level. You would be wrong.

Use by default vector (or the safe container adapted to your need), and then if your profiler says it is a problem, see if you can optimize it, either by using a better algorithm, or changing container.

This said, we can go back to the original question.

Static/Dynamic Array?

The C++ array classes are better behaved than the low-level C array because they know a lot about themselves, and can answer questions C arrays can't. They are able to clean after themselves. And more importantly, they are usually written using templates and/or inlining, which means that what appears to a lot of code in debug resolves to little or no code produced in release build, meaning no difference with their built-in less safe competition.

All in all, it falls on two categories:

Dynamic arrays

Using a pointer to a malloc-ed/new-ed array will be at best as fast as the std::vector version, and a lot less safe (see litb's post).

So use a std::vector.

Static arrays

Using a static array will be at best:

as fast as the boost::array version
and a lot less safe.

So use a boost::array.

paercebal 2008-12-20 17:35:54

Thanks for addressing static arrays as well - std::vector is useless if you're not allowed to dynamically allocate memory for performance reasons.

Tom 2008-12-21 02:00:08

When you say "Using a static array will be at best as fast as the boost::array version" it shows how biased you are. It should be the other around, Boost:array can be at best fast like static arrays.

toto 2009-11-10 17:08:36

paercebal 2009-12-05 23:23:19

Answer 13

A:

Vectors use a tiny bit more memory than arrays since they contain the size of the array. They also increase the hard disk size of programs and probably the memory footprint of programs. These increases are tiny, but may matter if you're working with an embedded system. Though most places where these differences matter are places where you would use C rather than C++.

Brian 2009-05-11 21:37:02

If this matters, then you're obviously not using dynamically-sized arrays, and as such, your arrays don't need to change size. (If they did, you'd be storing the size somehow). Therefore, you might as well use boost::array unless I'm mistaken - and what makes you say that that needs to "store the size" somewhere?

Arafangion 2010-10-11 23:17:49

ansaurus

tags:

views:

answers:

Using arrays or std::vectors in C++, what's the performance gap?

Preamble for micro-optimizer people

Static/Dynamic Array?

Dynamic arrays

Static arrays

related questions