views: 212
answers: 4
For efficiency reasons, I always avoid writing loops like this:

    for(std::size_t i = 0; i < vec.size(); ++i) { ... }

where vec is an STL container. Instead, I either do

    const std::size_t vec_size = vec.size();
    for(std::size_t i = 0; i < vec_size; ++i) { ... }

or use the container iterators.
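
For reference, the iterator version looks roughly like this (just a sketch; assume vec is a std::vector<int>):

    #include <vector>

    void process(std::vector<int>& vec)
    {
        // end() is evaluated each iteration too, but like size() it is a
        // trivial inline accessor on mainstream implementations.
        for (std::vector<int>::iterator it = vec.begin(); it != vec.end(); ++it)
        {
            // ... use *it ...
        }
    }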

But how bad is the first solution really? I remember reading in Meyers that it will be quadratic instead of linear because the vector doesn't know its size and repeatedly has to count. But won't modern compilers detect this and optimize it away?

+7  A: 

vector::size() is constant-time and usually implemented as a trivial inline function that is optimised away. Don't bother hand-optimising it.

Marcelo Cantos
I'm not gonna downvote, but I don't agree: most often it's implemented as a subtraction, and in performance-critical sections it might be worth optimizing.
Viktor Sehr
As a subtraction AND a division (by the sizeof of the element type), which makes it even worse...
Jim Buck
(assuming the division isn't simple enough to be implemented by shifts/etc.)
Jim Buck
+1: I've done benchmarks, and it turns out the compiler will emit **identical** code for this type of thing (prompted by an argument in this post: http://stackoverflow.com/questions/535223/why-cant-i-push-this-object-onto-my-stdlist/535233#535233). The standard dictates that size() is a `const` function, so as long as you only use `const` operations in the loop, the compiler can know that it only needs to call it once. And by looking at the assembly, it does.
Evan Teran
@Evan: Thank you for sparing me the effort of arguing the point myself :-). I will add that it depends on what you do inside the loop. If it makes a non-inlined function call, there's a good chance the optimizer will give up the fight. But even then, the vast majority of types are a power of two in size, which makes the size calculation extremely cheap, and even if it isn't, a good compiler will be smart enough to multiply instead of dividing (g++ is, at least).
Marcelo Cantos
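
To make the comments above concrete, here is a rough sketch (not any particular library's actual code) of how size() is commonly implemented when a vector stores raw begin/end pointers. The pointer subtraction amounts to an integer subtraction followed by a division by sizeof(T), which the compiler reduces to a shift whenever sizeof(T) is a power of two:

    #include <cstddef>

    template <typename T>
    struct toy_vector // illustrative only
    {
        T* first_; // points at the first element
        T* last_;  // points one past the last element

        std::size_t size() const
        {
            // (last_ - first_) divides the byte distance by sizeof(T).
            return static_cast<std::size_t>(last_ - first_);
        }
    };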
+8  A: 

I remember reading in Meyers that it will be quadratic instead of linear because the vector doesn't know its size and repeatedly has to count.

You're getting vector and list confused. vector's size value is held in the vector; list's requires traversal of the actual list.

Billy ONeal
Just out of curiosity, why can't the same principle be applied to std::list? That is, store the list's size, and increment/decrement accordingly.
Steve Guidi
@Steve: Because `std::list::splice` is required to be constant time, and has to work with arbitrary iterators. You'd have to traverse the list to get the distance there, which would make it linear time, not constant.
Billy ONeal
Thank you for the explanation!
Steve Guidi
@Steve: You're welcome. A better explanation is in Effective STL Item 4: Call `empty` instead of checking size against zero.
Billy ONeal
Edit on the above: `splice` isn't *required* to be constant time, but it *may* be. If `splice` *is* implemented to be constant time, `size` cannot be. If `size` is implemented to be constant, `splice` cannot be. Which is chosen depends on your particular implementation of the STL.
Billy ONeal
List **might** require traversal of the actual list. (And I believe size is required to be constant in C++0x).
jalf
@jalf: meaning we won't have a constant-time splice? That seems a bit of a waste given that `splice` is one of the rare reasons to use a list in the first place...
Matthieu M.
@Matthieu: yep. On the other hand, it gives us a consistent guarantee about `size()`. Anyway, I could be wrong, I haven't checked the draft standard.
jalf
@Matthieu and @jalf: If you look at the current C++0x draft, the table for all sequence containers requires `size` to be constant time. In `list`'s description, it says it supports **most** of the generalized sequence container requirements, but not all. But it also **requires** `splice` to be constant in 23.3.4.4, which means the generalized complexity requirement on sequence containers cannot be met here.
Billy ONeal
@Billy: thanks for the clarification!
Matthieu M.
@Matthieu @Billy: Actually, only the `splice()` overloads that take an entire list or a single element are required to run in constant time (which makes sense, since you have the new size available to you). The `splice()` overload that takes a range from another list only has to run in linear time. Since the `list` specification does not state otherwise, I believe its `size()` must have constant time complexity, per the optional sequence requirements. I'm not 100% sure on how to parse the word "most" in §23.3.4 though :-)
James McNellis
@James: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3090.pdf page 778: `Complexity: Constant time if &x == this; otherwise, linear time.`
Billy ONeal
@Billy: Right; if you are splicing within the same list, the size doesn't change; if you are splicing a range from another list, the complexity is linear because it has to count the number of elements inserted.
James McNellis
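
As a concrete illustration of the trade-off discussed above, here is a small sketch (a hypothetical helper, not real library code). If a list caches its element count so that size() is O(1), then splicing a range from another list forces it to discover how many nodes are moving, and with arbitrary iterators the only way to do that is to walk the range; that is why a library cannot offer both an O(1) size() and an O(1) cross-list range splice:

    #include <iterator>
    #include <list>

    template <typename T>
    void splice_with_cached_sizes(std::list<T>& dst,
                                  typename std::list<T>::iterator pos,
                                  std::list<T>& src,
                                  typename std::list<T>::iterator first,
                                  typename std::list<T>::iterator last)
    {
        // Counting the range is linear in the number of spliced elements;
        // this is exactly the cost a constant-time splice is not allowed to pay.
        typename std::list<T>::difference_type n = std::distance(first, last);
        (void)n; // a real implementation would use n to update both cached sizes

        dst.splice(pos, src, first, last); // the node relinking itself is cheap
    }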
+2  A: 

The easiest way to tell whether something is being optimized out by the compiler is to compare its assembly output for the two versions.

That said, the two chunks of code are not actually equivalent. What if the size of the vector changes while you're iterating over it? The compiler would have to be very, very smart to prove conclusively that the vector's size could not change.

Now, in the real world, is this tiny optimization really worth the extra effort? The vec.size() just returns a stored value. It doesn't re-count the length.

Borealid
I see this advice all the time, and it has one fatal flaw: many of us do not know assembly language, and therefore would not be able to make sense of the compiled output. If you know assembly, then great. On the other hand, if we knew assembly, we'd probably not be asking the question in the first place.
Billy ONeal
@Billy I can't imagine doing C++ programming on a platform I didn't know at least the basics of assembly language for - if you don't know it, take a few days off to learn the basics, it's well worth the effort.
anon
@Borealid: the compiler actually can know conclusively that the size **can't change**, and easily. All it needs to do is guarantee that any operations on `vec` inside the loop are `const` (and don't reference non-const variables created outside the loop, since they may alias `vec`). So for simple things like summation, printing, etc., which are clearly read-only operations, it's trivial to determine the const-ness of the loop as a whole.
Evan Teran
@Evan Teran: And what if there are multiple threads accessing the structure, or it resides in a shared memory region? It's not that simple.
Borealid
@Borealid: C++03 is entirely unaware of threads and shared memory, and compilers for the most part ignore their existence, with a few exceptions. As far as optimizations go, it is entirely "ok" for a compiler to optimize as if those are not an issue, and except for a few select circumstances that's what they do.
Evan Teran
@Borealid: if the size can change during the loop, not re-reading size is the least of your problems. And as Evan said, compilers are supposed to ignore that possibility unless instructed otherwise.
peterchen
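
For readers who want to check this themselves, a minimal experiment might look like the following (assuming GCC; `-S` asks the compiler to emit assembly instead of object code):

    // loops.cpp -- compile with:  g++ -O2 -S loops.cpp -o loops.s
    // then compare the assembly emitted for the two functions.
    #include <vector>
    #include <cstddef>

    int sum_calling_size(const std::vector<int>& vec)
    {
        int total = 0;
        for (std::size_t i = 0; i < vec.size(); ++i) // size() in the condition
            total += vec[i];
        return total;
    }

    int sum_cached_size(const std::vector<int>& vec)
    {
        int total = 0;
        const std::size_t n = vec.size();            // size() hoisted by hand
        for (std::size_t i = 0; i < n; ++i)
            total += vec[i];
        return total;
    }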
+1  A: 

Consider the following stupid function:

#include <vector>
#include <cstddef>
using std::vector;

void sum (vector<int>& vec, int* sumOut)
{
    *sumOut = 0;
    for(std::size_t i = 0; i < vec.size(); ++i)
    {
        *sumOut += vec[i];
    }
}

The actual assembly generated will depend on the compiler and implementation of vector, but I think in most cases, the compiler has to re-read the vector's size from memory each time through the loop. This is because the sumOut pointer could potentially overlap (alias) the vector's internal storage of the size (assuming the vector stores its size in an int), so the size could be changed by the loop. If you call a function like this a lot, it could add up to a lot of cycles because you're touching memory more than you need.

Three possible solutions are:

  1. Store the size in a local variable. Ideally, the size will get stored in a register and avoid touching memory altogether. Even if it has to get put on the stack, the compiler should be able to order the loads/stores more efficiently.

  2. Use __restrict on the output pointer. This tells the compiler that the pointer can't possibly overlap anything else, so writes to it don't require reloading anything else.

  3. Reverse the loop. The termination condition then checks against 0 instead, so vec.size() only needs to be called once, when the loop starts.

Of those, I think #1 is the cleanest, but some people might prefer #3. #2 is probably the least reader-friendly, but might be faster than the others (because it means the vector's data could be read more efficiently). Sketches of all three variants appear below.

For more info on aliasing, see Christer Ericson's GDC presentation on memory optimization; there's an example almost identical to this in there.

celion
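
To make the three options above concrete, here are hedged sketches of each variant of the `sum` function (note that `__restrict` is a compiler extension rather than standard C++; GCC and Clang also accept `__restrict__`, and MSVC accepts `__restrict`):

    #include <vector>
    #include <cstddef>

    // 1. Cache the size in a local variable (hopefully a register).
    void sum_cached(std::vector<int>& vec, int* sumOut)
    {
        *sumOut = 0;
        const std::size_t n = vec.size();
        for (std::size_t i = 0; i < n; ++i)
            *sumOut += vec[i];
    }

    // 2. Promise the compiler that sumOut does not alias anything else.
    void sum_restrict(std::vector<int>& vec, int* __restrict sumOut)
    {
        *sumOut = 0;
        for (std::size_t i = 0; i < vec.size(); ++i)
            *sumOut += vec[i];
    }

    // 3. Reverse the loop so size() is only evaluated once, at initialization.
    void sum_reversed(std::vector<int>& vec, int* sumOut)
    {
        *sumOut = 0;
        for (std::size_t i = vec.size(); i-- > 0; )
            *sumOut += vec[i];
    }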