views:

809

answers:

13

What is the preferred method of writing loops according to efficiency: Way a)

   /*here I'm hoping that compiler will optimize this  
 code and won't be calling size every time it iterates through this loop*/
    for (unsigned i = firstString.size(); i < anotherString.size(), ++i)
    {
    //do something
    }

or maybe should I do it this way: Way b)

unsigned first = firstString.size();
unsigned second = anotherString.size();

and now I can write:

    for (unsigned i = first; i < second, ++i)
    {
    //do something
    }

the second way seems to me like worse option for two reasons: scope polluting and verbosity but it has the advantage of being sure that size() will be invoked once for each object.
Looking forward to your answers.

+13  A: 

I would first use the first version, simply because it looks cleaner and easier to type. Then you can profile it to see if anything needs to be more optimized.

But I highly doubt that the first version will cause a noticable performance drop. If the container implements size() like this:

inline size_t size() const
{
    return _internal_data_member_representing_size;
}

then the compiler should be able to inline the function, eliding the function call. My compiler's implementation of the standard containers all do this.

In silico
+3  A: 

This is one of those things that you should test yourself. Run the loops 10,000 or even 100,000 iterations and see what difference, if any, exists.

That should tell you everything you want to know.

Chris Lively
+12  A: 

In general, let the compiler do it. Focus on the algorithmic complexity of what you're doing rather than micro-optimizations.

However, note that your two examples are not semantically identical - if the body of the loop changes the size of the second string, the two loops will not iterate the same amount of times. For that reason, the compiler might not be able to perform the specific optimization you're talking about.

Terry Mahaffey
+1 for pointing out that the two snippets of code don't perform the same operation.
High Performance Mark
+8  A: 

How will a good compiler optimize your code? Not at all, as it can't be sure size() has any side-effects. If size() had any side effects your code relied on, they'd now be gone after a possible compiler optimization.

This kind of optimization really isn't safe from a compiler's perspective, you need to do it on your own. Doing on your own doesn't mean you need to introduce two additional local variables. Depending on your implementation of size, it might be an O(1) operation. If size is also declared inline, you'll also spare the function call, making the call to size() as good as a local member access.

Johannes Rudolph
Having size() be an O(1) operation is easy even with mutable strings.
sepp2k
@sepp2k: Yeah, but you somehow need to lazily evaluate the size after the string's content changed (i.e using a dirty flag). Thus it's not _always_ O(1), whereas immutable strings could compute their size on creation. Still, it's irrelevant for this question.
Johannes Rudolph
@JohannesRudolph: Do I? What's wrong with just updating the length with each destructive operation? E.g. when appending a string of size n, `length += n`. Is there an operation where you can't easily update the length in O(1) time while performing the operation? (Yes, it's irrelevant, just asking).
sepp2k
@sepp2k:All this effort for "always O(1)"? You're right, it's probably possible, I haven't thought about it in detail. I've heard there's a saying that goes like 'every C++ programmer has written his own string implementation' ;-)
Johannes Rudolph
@Johannes: You make it sound like it's difficult to have `size()` be O(1). `std::string::size()` is O(1).
GMan
Umm... I'm pretty sure the compiler _can_ in _some_ situations, determine that the call to size has no side effects...
Longpoke
considering that size is a `const` function, the compiler can know that it has no side effects on the string...
Evan Teran
@GMan: Maybe it isn't difficult, I don't know. I have never looked into writing a string implementation. Also, I don't understand why my side note in (brackets) has caused so much discussion here. In order we do not further pollute this thread here, I will remove that note.
Johannes Rudolph
@Evan Teran: Very good catch, I haven't thought about that, but I guess you're right. Oh man, I really miss const-correct programming.
Johannes Rudolph
@Evan Teran: The fact that it is a template and the compiler can see the definition is what tells it that there will be no side effects. If the compiler was not able to read the internals of the function, then even if it is `const` it cannot elide the call and keep the *as-if* guarantee (think that a constant operation can modify a mutable subobject or state outside of the object as in globals or static variables)
David Rodríguez - dribeas
@David Rodríguez - dribeas: I see what you as saying about the "as-if" guarantee, but I'm not sure that is 100% correct. `mutable` things are still supposed to be logically const (doesn't change anything as far as the public interface is concerned). So even if `size()` did change a `mutable` item, i think it would be fair game for the compiler to consider it to have no side effects since logically calling `size()` shouldn't have any **visible** effects.
Evan Teran
@Evan Teran: The issue of other side effects still remain. In particular, the compiler cannot know whether a static attribute or a global variable is modified. Not only that, the compiler cannot possibly know whether the state retrieved by the function may depend on external modifications, think a function with a pointer to `volatile` that only reads the value, the standard requires that the read is *performed* for every invocation of the function, regardless of whether the member function is constant or not.
David Rodríguez - dribeas
@David Rodríguez - dribeas:You make a very good point, certainly there are cases where the compiler can't know. Usage of function pointers,globals and volatile variables are good examples. But just because it can't be done in the general case, doesn't mean that there aren't many common cases where it can work. For example, the compiler can know that a function **only** references class local data. Which is a class that something as simple as `size()` should fit into nicely. Common code doesn't tend to use the techniques you mention, meaning that the compiler can often make clever optimizatons.
Evan Teran
Looking at your first post to me, I do see your point about templates, the compiler will need access to the implementation to obtain the knowledge I'm talking about. Perhaps a smart compiler could encode the "const-ness" of a function into the object file somehow so that link time optimizations could take advantage of this knowledge. But I doubt that's the case :-P.
Evan Teran
@Evan Teran: I think it would be a safe optimization only in case the compiler could be sure the function is a true function in the mathematical sense with no side effects (sometimes called pure functions). Since this concept is not baked into the language (as opposed to e.g. Haskell) this optimization is not safe in the general case. The specific case here could also be "whitelisted" into a "safe to optimize list" by the compiler vendor, since they know about the implementation...
Johannes Rudolph
@Johannes Rudolph: Good point. Also noteworthy is the "pure" and "const" attributes that gcc supports for functions which helps the optimizer make this determination
Evan Teran
+1  A: 

Here's how I look at it. Performance and style are both important, and you have to choose between the two.

You can try it out and see if there is a performance hit. If there is an unacceptable performance hit, then choose the second option, otherwise feel free to choose style.

Moe
+6  A: 

Don't pre-optimize your code. If you have a performance problem, use a profiler to find it, otherwise you are wasting development time. Just write the simplest / cleanest code that you can.

Brent Arias
Generally, I'd say that writing the cleanest code you can will make things easier for an optimizer as well.
Mattias Nilsson
+37  A: 

i usually write this code as:

/* i and size are local to the loop */
for (size_t i = firstString.size(), size = anotherString.size(); i < size, ++i) {
  //do something
}

this way i do not pollute the parent scope and avoid calling anotherString.size() for each loop iteration.

this is especially useful for iterators:

for(some_generic_type<T>::forward_iterator it = collection.begin(), end = collection.end();
    it != end; ++it) {
   // do something with *it
}
knittl
Kicking myself now for never once thinking of doing this in 11 years!
brone
+1 Somehow I've always missed the fact you can initialize multiple variables in a for loop.
Chris Lively
comes in very handy with iterators, if you don't want to use for_each ;)
knittl
It's not really a big problem to call anotherString.size() on each iteration. The size is just going to be a (probably inline) member access on the string.Using something like strlen in the loop is of course a problem because it has to iterate the entire string on each loop.
Mike Weller
size_t is your friend...
Longpoke
@longpoke: done!
knittl
+1  A: 

I'm hoping that compiler will optimize this...

You shouldn't. Anything involving

  • A call to an unknown function or
  • A call to a method that might be overridden

is hard for a C++ compiler to optimize. You might get lucky, but you can't count on it.

Nevertheless, because you find the first version simpler and easier to read and understand, you should write the code exactly the way it is shown in your simple example, with the calls to size() in the loop. You should consider the second version, where you have extra variables that pull the common call out of the loop, only if your application is too slow and if you have measurements showing that this loop is a bottleneck.

Norman Ramsey
C++ does not have "unknown functions", or "known functions", come to that.
anon
Maybe the compiler could know that size() function for a STL string has no effect since compilers can offer their own implementation of the STL.
Kamchatka
@Norman, but std::string is a specialization of std::basic_string, which is a template, so it will already have the definition available.
Michael Aaron Safyan
@Michael, sounds like a known function then.
Norman Ramsey
@Neil: "known" and "unknown" are compiler terms, not language terms. A function is "known" at a call site if the compiler has the body of every function that could be called. Otherwise it is "unknown". The same function may be "known" or "unknown" in different compilers depending on the degree of optimization across compilation units. My understanding is that most C++ compilers treat functions defined in other .cpp files as unknown. A function instantiated from a template will be known.
Norman Ramsey
+1  A: 

My recommendation is to let inconsequential optimizations creep into your style. What I mean by this is that if you learn a more optimal way of doing something, and you cant see any disadvantages to it (as far as maintainability, readability, etc) then you might as well adopt it.

But don't become obsessed. Optimizations that sacrifice maintainability should be saved for very small sections of code that you have measured and KNOW will have a major impact on your application. When you do decide to optimize, remember that picking the right algorithm for the job is often far more important than tight code.

dicroce
+1 for pointing out the *algorithm > coding* for performance thing.
JUST MY correct OPINION
A: 

You shouldn't optimize your code, unless you have a proof (obtained via profiler) that this part of code is bottleneck. Needless code optimization will only waste your time, it won't improve anything.

You can waste hours trying to improve one loop, only to get 0.001% performance increase. If you're worried about performance - use profilers.

SigTerm
You also shouldn't pessimize code. You can also waste hours trying to cite Knuth and advances in optimization technology, only to get a 0.000% performance increase.
peterchen
I'm increasingly of the opinion that the "never microoptimize" camp is just as wrong as "microptimize freely". You really should never say never.
John Dibling
@John Dibling: I meant to say that you should optimize things only based on data provided by tools (profilers), not blindly, by "gut feeling" or something. If you thing that a loop is going to be slow, you should measure how much time exactly it takes, and how much extra time you get by modifying things. Also, it makes sense to look at bigger picture. There is no point in 200% performance increase in function that takes less than 0.5% of total program execution time.
SigTerm
A: 

There's nothing really wrong with way (b) if you just want to write something that will probably be no worse than way (a), and possibly faster. It also makes it clearer that you know that the string's size will remain constant.

The compiler may or may not spot that size will remain constant; just in case, you might as well perform this optimization yourself. I'd certainly do this if I was suspicious that the code I was writing was going to be run a lot, even if I wasn't sure that it would be a big deal. It's very straightforward to do, it takes no more than 10 extra seconds thinking about it, it's very unlikely to slow things down, and, if nothing else, will almost certainly make the unoptimized build run a bit more quickly.

(Also the first variable in style (b) is unnecessary; the code for the init expression is run only once.)

brone
A: 

For the "std::size_t size()const" member function which not only is O(1) but is also declared "const" and so can be automatically pulled out of the loop by the compiler, it probably doesn't matter. That said, I wouldn't count on the compiler to remove it from the loop, and I think it is a good habit to get into to factor out the calls within the loop for cases where the function isn't constant or O(1). In addition, I think assigning the values to a variable leads to the code being more readable. I would not suggest, though, that you make any premature optimizations if it will result in the code being harder to read. Again, though, I think the following code is more readable, since there is less to read within the loop:

 std::size_t firststrlen = firststr.size();
 std::size_t secondstrlen = secondstr.size();
 for ( std::size_t i = firststrlen; i < secondstrlen; i++ ){
      // ...
 }

Also, I should point out that you should use "std::size_t" instead of "unsigned", as the type of "std::size_t" can vary from one platform to another, and using "unsigned" can lead to trunctations and errors on platforms for which the type of "std::size_t" is "unsigned long" instead of "unsigned int".

Michael Aaron Safyan
A: 
  1. How much percent of time is spent in for as opposed to // do something? (Don't guess - sample it.) If it is < 10% you probably have bigger issues elsewhere.

  2. Everybody says "Compilers are so smart these days." Well they're no smarter than the poor coders who write them. You need to be smart too. Maybe the compiler can optimize it but why tempt it not to?

Mike Dunlavey