views:

776

answers:

12

These for-loops are among the first basic examples of formal correctness proofs of algorithms. They have different but equivalent termination conditions:

1   for ( int i = 0; i != N; ++i )

2   for ( int i = 0; i < N; ++i )

The difference becomes clear in the postconditions:

  • The first one gives the strong guarantee that i == N after the loop terminates.

  • The second one only gives the weak guarantee that i >= N after the loop terminates, but you will be tempted to assume that i == N.

If for any reason the increment ++i is ever changed to something like i += 2, or if i gets modified inside the loop, or if N is negative, the program can fail:

  • The first one may get stuck in an infinite loop. It fails early, in the loop that has the error. Debugging is easy.

  • The second loop will terminate, and at some later time the program may fail because of your incorrect assumption of i == N. It can fail far away from the loop that caused the bug, making it hard to trace back. Or it can silently continue doing something unexpected, which is even worse.

Which termination condition do you prefer, and why? Are there other considerations? Why do many programmers who know this, refuse to apply it?

+1  A: 

In C++, using the != test is preferred for generality. Iterators in C++ have various concepts, like input iterator, forward iterator, bidirectional iterator, random access iterator, each of which extends the previous one with new capabilities. For < to work, random access iterator is required, whereas != merely requires input iterator.

Chris Jester-Young
+1  A: 

We shouldn't look at the counter in isolation - if for any reason someone changed the way the counter is incremented they would change the termination conditions and the resulting logic if it's required for i==N.

I would prefer the the second condition since it's more standard and will not result in endless loop.

Svet
+1  A: 

If you trust your code, you can do either.

If you want your code to be readable and easily understood (and thus more tolerant to change from someone who you've got to assume to be a klutz), I'd use something like;

for ( int i = 0 ; i >= 0 && i < N ; ++i)
Unsliced
A: 

I always use #2 as then you can be sure the loop will terminate... Relying on it being equal to N afterwards is relying on a side effect... Wouldn't you just be better using the variable N itself?

[edit] Sorry...I meant #2

tim_yates
A: 

In general I would prefer

for ( int i = 0; i < N; ++i )

The punishment for a buggy program in production, seems a lot less severe, you will not have a thread stuck forever in a for loop, a situation that can be very risky and very hard to diagnose.

Also, in general I like to avoid these kind of loops in favour of the more readable foreach style loops.

Sam Saffron
A: 

I think most programmers use the 2nd one, because it helps figure out what goes on inside the loop. I can look at it, and "know" that i will start as 0, and will definitely be less than N.

The 1st variant doesn't have this quality. I can look at it, and all I know is that i will start as 0 and that it won't ever be equal to N. Not quite as helpful.

Irrespective of how you terminate the loop, it is always good to be very wary of using a loop control variable outside the loop. In your examples you (correctly) declare i inside the loop, so it is not in scope outside the loop and the question of its value is moot...

Of course, the 2nd variant also has the advantage that it's what all of the C references I have seen use :-)

Allan Mertner
A: 

I prefer to use #2, only because I try not to extend the meaning of i outside of the for loop. If I were tracking a variable like that, I would create an additional test. Some may say this is redundant or inefficient, but it reminds the reader of my intent: At this point, i must equal N

@timyates - I agree one shouldn't rely on side-effects

Nescio
+3  A: 

I tend to use the second form, simply because then I can be more sure that the loop will terminate. I.e. it's harder to introduce a non-termination bug by altering i inside the loop.

Of course, it also has the slightly lazy advantage of being one less character to type ;)

I would also argue, that in a language with sensible scope rules, as i is declared inside the loop construct, it shouldn't be available outside the loop. This would mitigate any reliance on i being equal to N at the end of the loop...

Martin McNulty
I was thinking the same thing, surely most languages would have i only at the loop scope. +1
Shaun Austin
In Perl: `for( my $i=0; $i<N; ++$i ){...}`
Brad Gilbert
A: 

I think you stated very well the difference between the two. I do have the following comments, though:

  • This is not "language-agnostic", I can see your examples are in C++ but there are languages where you are not allowed to modify the loop variable inside the loop and others that don't guarantee that the value of the index is usable after the loop (and some do both).

  • You have declared the i index within the for so I would not bet on the value of i after the loop. The examples are a little bit misleading as they implictly assume that for is a definite loop. In reality it is just a more convenient way of writing:

    // version 1
    { int i = 0;
      while (i != N) {
        ...
        ++i;
      }
    }
    

    Note how i is undefined after the block.

If a programmer knew all of the above would not make general assumption of the value of i and would be wise enough to choose i<N as the ending conditions, to ensure that the the exit condition will be eventually met.

Remo.D
A: 

Using either of the above in c# would cause a compiler error if you used i outside the loop

Adrian
A: 

I prefer this sometimes:

for (int i = 0; (i <= (n-1)); i++) { ... }

This version shows directly the range of values that i can have. My take on checking lower and upper bound of the range is that if you really need this, your code has too many side effects and needs to be rewritten.

The other version:

for (int i = 1; (i <= n); i++) { ... }

helps you determine how often the loop body is called. This also has valid use cases.

xmjx
A: 

For general programming work I prefer

for ( int i = 0; i < N; ++i )

to

for ( int i = 0; i != N; ++i )

Because it is less error prone, especially when code gets refactored. I have seen this kind of code turned into an infinite loop by accident.

That argument made that "you will be tempted to assume that i == N", I don't believe is true. I have never made that assumption or seen another programmer make it.

cdiggins