ansaurus

Question

c++ performance issue beginner (size function in for loop)

Answer 1

+2 A:

It's 'called' each time, but I put called into quotes because it really probably is just an inline method call, so you don't have to worry about its performance.

Why not use vector<int>::iterator instead?

Daniel Mošmondor 2010-10-10 18:40:14

"vector<int>::iterator" is a lot more verbose than "int" -- without providing any real value. Although, as written, the OP probably gets a signed/unsigned comparison warning with int vs. vector<int>::size_type.

nobar 2010-10-10 18:53:59

@nobar: I think iterators provide massive benefits with zero down side. I am sorry you feel that typing a few characters as a burden. Since the whole STL is based on iterators learning to use them correctly is a nessesty.

Martin York 2010-10-10 21:45:21

@Martin: The C++ standards committee is also sorry, which is why they've provided range-based for in C++0x as a replacement for many cases of `for_each` and the other very simple algorithms. Except I think their sympathy is more sincere ;-p

Steve Jessop 2010-10-10 22:06:26

@Steve Jessop: Not being argumentative but any reference to why they think iterators were bad? I see Iterators as a step forward (not a perfect step but a step). After a decade of using we now understand the problems better and have a better understanding of the strengths (and weaknesses) of iterators. The range concept is just another step that is built on-top of iterators, I don;t see them as a replacement I see them as an extension of the basic underlying concept.

Martin York 2010-10-10 22:44:42

I was hoping that concepts would have made it into the standard as this (by my understanding (though I have not read up on why (because was dropped))) would have made iterators easier to extend and use and debug.

Martin York 2010-10-10 22:47:33

I almost forget. The standards committee agreeds with @nobar that too much typing was arduous on us poor underpaid developers and introduced the 'auto' type concept (to increase our pay per keystroke by reducing the number of key strokes required).

Martin York 2010-10-10 22:50:02

@Martin: Good point about the use of "auto" -- that should simplify the use of iterators in the future. Also, I don't disagree that iterators are useful for many purposes, but I write a lot of for loops that iterate over vectors and I happen to prefer the indexed approach in most cases. One reason might be that I don't want to type so much, but a bigger reason is probably that I don't want to have to read it. Also, the accessor syntax is different depending on which approach you take -- so that's another thing to consider when deciding which way to go.

nobar 2010-10-11 00:06:35

@nobar: I agree with not wanting to type or read it. That is why I nearly always (99% of the time) use a std::algorithm that will do the looping for me.

Martin York 2010-10-11 00:57:47

@Martin: I don't mean they've renounced iterators, just that they genuinely are sorry that typing a few characters is such a burden. Ranged-based for is pretty much just to reduce code, regardless of whether you were previously using the equivalent loop, or the equivalent `for_each`. Of course Alexandrescu specifically does think that "something, somewhere, went terribly wrong" with iterators, and that ranges should replace the concept, not extend it. But then in a sense he's a C++ "deserter" anyway.

Steve Jessop 2010-10-11 01:36:28

@Martin: I think it depends on the type of software that you are developing. For me, most of my vectors are not containers of "objects", but just large arrays of simple data (such as image pixels). Most of the for loops that I write are part of a custom algorithm that I am writing, and it seems to me that using std::for_each() would be much more cumbersome for my particular purposes -- especially, since I typically use the index value to also access neighboring pixels.

nobar 2010-10-11 02:28:48

...Also, most of the people that I work with don't know the first thing about STL (they have other domains of expertise of course), so overuse would just needlessly confuse them. This fact also makes me glad that there are people on StackOverflow for me to converse with about this stuff. :)

nobar 2010-10-11 02:31:04

@nobar: anyway my implementation of `<algorithm>` uses loops in preference to implementing everything on top of `for_each`, and I don't see why I shouldn't too :-)

Steve Jessop 2010-10-11 02:51:52

@Steve: Right on! I took a peek at <algorithm> for GNU C++. It uses lots of loops and lots and lots of iterators. Of course STL algorithms support various containers, so they *have* to use iterators.

nobar 2010-10-11 03:21:07

Answer 2

+1 A:

The size() member function is called each time, but it would be a really bad implementation that wouldn't inline it, and a strange one where it wouldn't be a simple access of a fixed datum or a subtraction of two pointers.
Anyway, you shouldn't worry yourself with such trivialities until you have profiled your application and found out that this is a bottleneck.

However, what you should pay attention to is:

The correct type for a vector's index is std::vector<T>::size_type.
There are types (some iterators, for example) where i++ might be slower than ++i.

Therefore, the loop should be:

for(vector<int>::size_type i=0; i<var.size(); ++i)
  ...

sbi 2010-10-10 18:41:22

@Zack: Thanks for doing so!

sbi 2010-10-11 07:06:59

Answer 3

+1 A:

It must be called everytime because size() might return a different value everytime.

Therefore there's no big choice it simply must be.

Vinzenz 2010-10-10 18:42:04

This answer is correct in the most general sense (the resulting code *must* behave as if it were called every time), but compiler writers work *very* hard at detecting the special cases where it can be safely factored out.

dmckee 2010-10-10 18:47:24

That's true ;-) However you cannot rely on this as this is compiler specific.

Vinzenz 2010-10-10 18:56:34

Answer 4

+11 A:

In theory, it is called each time, since a for loop:

for(initialization; condition; increment)
    body;

is expanded to something like

{
    initialization;
    while(condition)
    {
        body;
        increment;
    }
}

(notice the curly braces, because initialization is already in an inner scope)

In practice, if the compiler understands that a piece of your condition is invariant through all the duration of the loop and it does not have side-effects, it can be smart enough to move it out. This is routinely done with strlen and things like that (that the compiler knows well) in loops where its argument isn't written.

The compiler may be helped in this optimization also by the various const modifiers that may be present: for example, if you're manipulating a vector via a const reference, the compiler can exploit this information to be sure that its fields will never change.

Doing that optimization by hand is worthy if you know that a part of your condition is "expensive" to evaluate (and such condition isn't, since it usually boils down to a pointer subtraction, which is almost surely inlined).

Edit: as others said, in general with containers it's better to use iterators, but for vectors it's not so important, because random access to elements via operator[] is guaranteed to be O(1); actually with vectors it usually is a pointer sum (vector base+index) and dereference vs the pointer increment (preceding element+1) and dereference of iterators. Since the target address is still the same, I don't think that you can gain something from iterators in terms of cache locality (and even if so, if you're not walking big arrays in tight loops you shouldn't even notice such kind of improvements).

For lists and other containers, instead, using iterators instead of random access can be really important, since using random access may mean walk every time the list, while incrementing an iterator is just a pointer dereference.

The use of vector<int>::size_type instead of just int is important too, because you gain that your code can actually walk all the elements the vector can contain and avoid warnings (and potential logic errors at runtime) about signed/unsigned comparisons.

Matteo Italia 2010-10-10 18:43:36

excellent answer

Robert Karl 2010-10-10 18:45:09

@Robert Karl: Thank you :)

Matteo Italia 2010-10-10 18:46:15

Actually... shouldn't the increment be inside the while loop?

ronag 2010-10-10 18:51:47

@ronag: whoops, you're right, fixed. :)

Matteo Italia 2010-10-10 18:54:40

"if you're manipulating a vector via a const reference, the compiler can exploit this information to be sure that its fields will never change". Not unless the vector object itself (not just the reference) is const. If you call code that could potentially modify the vector through an alias, then the compile cannot optimise even if *your* reference is const. If you don't call to unknown code, then the compiler is allowed to optimise even if your reference is non-const.

Steve Jessop 2010-10-10 21:57:59

Uh, yes, you're right indeed. :S

Matteo Italia 2010-10-10 22:08:33

Answer 5

+1 A:

As other have said

the semantics must be as if it were called each time
it is probably inlined, and is probably a simple function

on top of which

a smart enough optimizer may be able to deduce that it is a loop invariant with no side effects and elide it entirely (this is easier if the code is inlined, but may be possible even if it is not if the compiler does global optimization)

dmckee 2010-10-10 18:44:49

Answer 6

A:

But it could be done in this way (providing that this loop intends to only read/write without actually changing the size of a vector):

for(vector<int>::size_type i=0, size = var.size(); i < size; ++i) 
{
//do something
}

In the loop above you have just one call to size independently from size being inlined or not.

There is nothing we can do 2010-10-10 18:53:42

Answer 7

+1 A:

I think that if the compiler can conclusively deduce that the variable var is not modified inside the "loop body"

for(int i=0; i< var.size();i++) { 
    // loop body
}

then the above may be transposed to something equivalent of

const size_t var_size = var.size();
for( int i = 0; i < var_size; i++ ) { 
    // loop body
}

However, I am not absolutely sure, so comments are welcome :)

Also,

In most situations, the size() member function is inlined, so the issue does not warrant worrying
The concern is perhaps equally applicable to the end() which is always used for iterator based looping, i.e. it != container.end()
Please consider using size_t or vector<int>::size_type for the type of i [See Steve Jessop's comment below.]

ArunSaha 2010-10-10 19:12:34

The return type of `std::vector<int>::size()` is `std::vector<int>::size_type`, which you would strongly expect to be the same type as `size_t`, but needn't be.

Steve Jessop 2010-10-10 22:09:00

@Steve Jessop: You are right, edited accordingly.

ArunSaha 2010-10-10 23:49:23

Answer 8

A:

Hi, as others said, The compiler shall decide what to do with the actual code written. The key figure is that it is called each time. But if you want to get a performance boost, it is best to write your code with some considerations. Your case is one of them, there are others as well, like the difference between these two pieces of code:

for (int i = 0 ; i < n ; ++i)
{
   for ( int j = 0 ; j < n ; ++j)
       printf("%d ", arr[i][j]);
   printf("\n");
}
for (int j = 0 ; j < n ; ++j)
{
   for ( int i = 0 ; i < n ; ++i)
       printf("%d ", arr[i][j]);
   printf("\n");
}

The difference is that the first one will not change the ram page too much per references, but the other will exhaust your cache and TLB and other stuff.

Also inline won't help that much! because the order of the calling function will remain as n(size of the vector) times. It helps in some places though, but the best thing is to rewrite your code.

But! if you want to let a compiler do it's optimizations over your code NEVER put volatile, like so:

for(volatile int i = 0 ; i < 100; ++i)

It prevents the compiler from optimizing. If you need another hint for performance use register instead of volatile.

for(register int i = 0 ; i < 100; ++i)

The compiler will try not to move i from the CPU-registers to RAM. It is not ensured that it can do it, but it will do it's best ;)

Green Code 2010-10-10 19:42:23

I can't think of any compiler where register is actually taken into consideration... the compiler will do its own register choices.

ronag 2010-10-10 20:05:35

Ofcourse the inline will help... since it will probably be inlined to the size member variable, thus no function call...

ronag 2010-10-10 20:06:27

Also, even though you are correct about the cache locality... it has nothing to do with the question asked...

ronag 2010-10-10 20:07:43

@ronag: I guess that is the wrong idea, I did not say inline won't help, i just said rewriting the code is better. Also it is also compilers choice to inline the function or not. I just answered his question this way because i thought he was curious about how to make for loops better.

Green Code 2010-10-10 21:28:00

How would re-writing the code be better? Any decent compiler will make a better decision regarding these micro optimization than either of us.

ronag 2010-10-10 21:29:29

Also what is register keyword made for then? C++ was making keywords to do what?

Green Code 2010-10-10 21:29:29

The register keyword was useful in a time when compiler optimization were not as good as today.

ronag 2010-10-10 21:32:24

@Green Code: C++ didn't invent the `register` keyword, by the way, it inherited it from `C`.

Steve Jessop 2010-10-10 22:11:35

ansaurus

tags:

views:

answers:

c++ performance issue beginner (size function in for loop)

related questions