views:

447

answers:

6

I tried two things (pseudocode below):

int arr[10000];
for (int i = 0; i < 10000; i++)
{
   for (int j = 0; j < 10000; j++)
   {
       arr[j] = j;
   }
}

and

vector<int> arr(10000);
for (int i = 0; i < 10000; i++)
{
   for (int j = 0; j < 10000; j++)
   {
       arr[j] = j;
   }
}

I ran both programs and timed them with the shell's "time" command. Program 1 runs in 5 seconds; program 2 runs in 30 seconds. With compiler optimization turned on, both run in about the same time (0.38 s). I am confused by these results. Can someone please explain why this is happening?

Thanks!

+15  A: 

For std::vector, subscripting is done with operator[]. With optimization turned off, that'll usually be generated as a real function call, adding a lot of overhead to something as simple as subscripting into an array. When you turn on optimization, it's generated inline, removing that overhead.

Jerry Coffin
Hey Jerry, thank you for your answer. It makes sense now. I guess with arrays, using [] doesn't make a call to operator[] since it's a 'natural' (for lack of a better word) type. Hence the 5/30.
aip.cd.aish
Right -- for a built-in array, the subscripting code is almost certainly *always* generated inline.
Jerry Coffin
And this underscores how function call overhead can really add up.
Crashworks
@aip.cd.aish: The words "intrinsic" and "native" are used more commonly than "natural" to describe features built into a language.
Adisak
@Adisak: Cool, Thanks. Good to know :)
aip.cd.aish
+7  A: 

In debugging mode, implementations of std::vector provide a lot of run-time checking for ease of use. This checking is not available for native arrays. For example, in VC2008, if you compile your vector example in debugging mode, there will be range-checking even in the case of operator[].

AraK
+4  A: 

If your non-optimized vector implementation is performing bounds checking, that would account for the discrepancy.

Dan Blanchard
++ My guess is you're right, but I wanted him (or her) to find out.
Mike Dunlavey
A: 

Because when you write vector<int> arr(10000); you create an object and call its member functions, which will be slower than when you create int arr[10000];.

Davit Siradeghyan
A: 

In your examples, the array is on the stack. Accessing data in the array involves accessing data on the stack. That's fast.

On the other hand, while the vector is on the stack, the data for a std::vector is allocated somewhere else (by default it's allocated on the heap via std::allocator). Accessing data in the vector involves accessing data on the heap. That's much slower than accessing data on the stack.

You get something for the performance penalty, though. std::vector is growable, while a regular array is not. Also, the size of a std::vector does not have to be compile time constant, while the size of an array on the stack does. A heap-allocated array (via operator new[]) does not have to be a compile-time constant. If you compare a heap allocated array with a std::vector you'll find the performance is much closer.

int* arr = new int[10000];
for (int i = 0; i < 10000; i++)
{
   for (int j = 0; j < 10000; j++)
   {
       arr[j] = j;
   }
}

delete[] arr; // std::vector does this for you
Max Lybbert
+4  A: 

These are good answers, but there's a quick way you can find out for yourself.

You're seeing a 6-to-1 difference in performance, right? Just run the slow one and hit the "pause" button. Then look at the call stack. The probability is 5 out of 6 (83%) that you will see exactly how it is spending those 25 extra seconds. Do it several times to get as much insight as you want.

For the optimized case, do the same with program 1. Since it is 13 times slower than the optimized program, you will see the reason why on each "pause", with probability 12/13 = 92%.

That is an application of this technique.

Mike Dunlavey
I've seen many developers' minds blown when first seeing the good ol' pause technique, especially when they don't buy it and spend another several hours setting up a full profiling run only to discover "pause" already had it nailed.
DanO
@DanO: Yeah. I'm just trying to get the word out. There's no reason for people not to know it.
Mike Dunlavey
Hey Mike, thanks for the reply. I would like to try it, but I am using a command line environment and gcc to compile my programs (I am new to this environment). Is there someway I can pause it and view the stack trace there? Thanks :)
aip.cd.aish
... There's a clumsy way that I've sometimes used. Write an empty function foo(){} and sprinkle calls to it throughout the code. Put a breakpoint in it, and disable that. Then, when it's running, see if you can enable the breakpoint at a random time. This method is not as precise, but better than nothing.
Mike Dunlavey
... There's another way that could work in your case. In gdb, just single-step at the instruction level "si". You'll see exactly what it's doing (though you may not like it :).
Mike Dunlavey
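Putting those comments together, the pause technique maps onto gdb directly; a hypothetical session (program name a.out assumed):

```
$ gdb ./a.out
(gdb) run
^C                  # hit Ctrl-C at a random moment -- the "pause"
(gdb) bt            # look at the call stack
(gdb) continue      # let it run, then repeat a few times
```

Whatever call shows up on most of the backtraces is where the time is going.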
@DanO: I'm reminded of an Edison story, where he was interviewing some engineers. He asked them to find the volume of a light bulb globe. Most of them set to work with calculus and geometry. One just filled it with water and poured it into a measuring cup!
Mike Dunlavey
@aip.cd.aish: Use Process Explorer from Sysinternals/Technet. On the process properties page | Threads tab, pick a thread and hit the Stack button. You get a snapshot from when you pushed the button. Look at what is on top, close it, and do it again; you can easily get a new sample nearly every second. Of course, without setting up symbols you only get offsets, but you also get to see into the system DLLs this way.
DanO
@DanO: I wasn't aware of Process Explorer, thanks. It's nice that you can see into system DLLs, but what I look for is mid-stack calls in code that I have control over. Example: http://stackoverflow.com/questions/406760/whats-your-most-controversial-programming-opinion/1562802#1562802
Mike Dunlavey