views:

1382

answers:

11

I read the following in a review of Knuth's "The Art of Computer Programming":

"The very 'practicality' means that the would-be CS major has to learn Kernighan's mistakes in designing C, notably the infamous fact that a for loop evaluates the for condition repeatedly, which duplicates while and fails to match the behavior of most other languages which implement a for loop."

(http://www.amazon.com/review/R9OVJAJQCP78N/ref=cm_cr_pr_viewpnt#R9OVJAJQCP78N)

What is this guy talking about? How could you implement a for loop that wasn't just syntactic sugar for a while loop?

+4  A: 

He probably refers to for loops like for i:=0 to N and for-each loops that iterate over the elements of a set. I suspect all languages that have a C-style for loop actually got it from C.

mweerden
Bill the Lizard
Plus one. For example, BASIC For loops.
Satish Motwani
+26  A: 

Consider this:

for i:=0 to 100 do { ... }

In this case, we could replace the final value, 100, by a function call:

for i:=0 to final_value() do { ... }

... and the final_value-function would be called only once.

In C, however:

for (int i=0; i<final_value(); ++i) // ...

... the final_value-function would be called for each iteration through the loop, thus making it a good practice to be more verbose:

int end = final_value();
for (int i=0; i<end; ++i) // ...
Magnus Hoff
I don't agree. How do you know the value of final_value() isn't going to return a different value each time? If final_value() only returns a constant then most compilers will optimize this as you say, but it won't always be the case.
Tarski
For this example I assume that you actually only want to call final_value once. This is cumbersome to express in C.It might very well be that you want final_value to be called every time, but most often I find that I only want it called once.
Magnus Hoff
@Tarski: and the compile cannot optimize away the call to final_value(), because it cannot know if the return value is stable. Consider "for(i=0; i<strlen(str); ++i)" Now, you may or may not be changing the length of str in your loop, (but 99.99% of the time, you won't be)
James Curran
Compiler will optimize it away if final_value() lives in the same source file as the loop.
Jim Buck
@Jim: You're making assumptions about the complexity of final_falue() and how much the optimizer can/will do. Some optimizers cross compilation units, others are limited even within one source file.
Darron
Sorry. What I said was incorrect. There will be cases where the compiler will optimize this. Like if final_value() is hardcoded to return 5, and is in the same source file, but also cases where it cannot - so how are other languages any different? like the equivalent of strlen in PASCAL?
Tarski
@Tarski: In Pascal, final_value would be called exactly once because this is the semantics of the for-loop in this language. It is not even considered an optimization. :) (Also, the equivalent of strlen is O(1) in Pascal ;)
Magnus Hoff
@Tarski that Optimization can only happen if the compiler can actually tell the function always returns the same thing... that's frequently not the case.
Spudd86
@James that one the compiler probably will optimize since it probably knows what strlen does.
Spudd86
@spudd86 - that's doubtful, as the compiler would also have to track what happens to the string being measured. Also, it's legal to write your own function called strlen which does something completely different.
James Curran
+2  A: 

Loop unrolling perhaps? If you know how many time the for loop is going to execute you can literally copy and paste the contents of the loop. Most while loops are going to be be based on some condition that isn't a simple counting from 0 to N, so won't be able to use this optimization.

Take this example

int x;
for (x = 10; x != 0; --x)
{
    printf ("Hello\n");
}

I know you would normally do x = 0; x <= 10; ++x, but all will be revealed in the assembly.

some pseudo assembly:

mov 10, eax
loop:
print "hello"
dec eax
jne loop

In this example we keep jumping back around the loop to print "hello" 10 times. However we are evaluating the condition with the jne instruction each time around the loop.

If we unrolled it we could simply put

print "hello"
print "hello"
print "hello"
print "hello"
print "hello"
print "hello"
print "hello"
print "hello"
print "hello"
print "hello"

We wouldn't need any of the other instructions, so it is faster. This is only a simple example - I'll try to find a better one!

Tarski
You mean it *could* be faster? The larger code size could lead to L1/L2 cache misses which then could result in slower code.
Isak Savo
+2  A: 

Magnus has it right, but one should also note that in most languages (pre-C), the conditional is the end criterion (i.e. "stop when i equals 100"). In C (and most post-C languages), it's the continue criterion (i.e., "continue while i is less than 100").

James Curran
but how could you know when i was 100 unless you evaluated it after each loop iteration?
Kip
I'm merely talking syntax, not evaluation.
James Curran
+2  A: 

Basically, the C (and Java, JavaScript, and lot of C-derived languages) for loop is indeed syntactic sugar for a while loop:

for (int i = 0; i < max; i++) { DoStuff(); }

which is a very current idiom is strictly equivalent to:

int i = 0; while (i < max) { DoStuff(); i++; }

(Putting apart scope issues, which changed between versions of C anyway.)

The stop condition is evaluated on each iteration. It can be interesting in some cases, but it can be a pitfall: I saw i < strlen(longConstantString) in a source, which is a major way to slow down a program, because strlen is a costly function in C.
Somehow, the for loop is mostly designed to run a number of times know in advance, you can still use break to terminate early, so the dynamic evaluation of the stop term is more annoying than useful: if you really need dynamic evaluation, you use while () {} (or do {} while ()).

On some other languages, like Lua, the stop condition is evaluated only once, at loop init. It helps predicting loop behavior and it is often more performant.

PhiLho
strlen is special since at least GCC knows it can optimize it away in certain cases. (It's a "pure" function since it always return the same output if its given the same input).
Isak Savo
Your max variable is badly named, better to call it count since the loop won't be executed with i = max.
Mark Baker
+2  A: 

On x86, for looping can be done on assembly level without making a while loop. The loop instruction decreases the value of the register ecx and jumps to the operand if ecx is greater than 0, it does nothing otherwise. This is a classic for loop.

mov ecx, 0x00000010h
loop_start:
;loop body
loop loop_start
;end of loop
DrJokepu
Assembler doesn't really count here.
DJClayworth
This is why some will tell you to loop backwards if possible, i.e. for(i=limit; i>0; i--) {whatever;}. Writing it that way can allow the compiler to make this optimization in the binary.
Kip
If the compiler can prove that each loop iteration is independent of the next, it can make this optimization on its own. I'd say that it's perfectly acceptable to bring assembly into this discussion since C is often called "cross platform assembly".
rmeador
+4  A: 

If all you want is a simple counting loop, then

for (i=0; i<100; i++) dostuff();

will be fine, and the compiler can optimize it.

If you use a function in the continue part of the for statement, like

for (i=0; i<strlen(s); i++) dostuff();

then the function will be evaluated every time and this is usually not a good idea as the function overheads will slow your process. Sometimes it can slow your process to the point of unusability.

If the function's return value will not change during the iteration, extract it from the loop:

slen = strlen(s);
for (i=0; i<slen; i++) dostuff();

But there are times when the function will be returning different values each call, and then you do not want it extracted from the loop:

for (isread(fd, &buffer, ISFIRST);
     isstat(fd) >= 0;
     isread(fd, &buffer, ISNEXT)
{
  dostuff(buffer);
}

and you want it evaluated each time. (That is a slightly contrived example based on work that I do, but it shows the potential).

C gives you the raw ability to roll your loop any way you can. You have to know how your loop is supposed to work, and you optimize it as best you can, depending on your needs.

That last example could have been expressed as a while loop:

isread(fd, &buffer, ISFIRST);
while (isstat(fd) >= 0)
{
  dostuff(buffer);
  isread(fd, &buffer, ISNEXT);
}

but it's not as neat, and if I use a continue in the loop, then I have to call the iterating isread again. Putting the whole thing in a for loop makes it neater, and ensures that the iterating isread is called each loop.

I write lower-level functions so they can be used in for loops like this. It brings all elements of the while loop together so you can understand it more easily.

codebunny
+2  A: 

In Ada (and I believe most other Algol-derived languages) the terminating condition of a "for" loop is evaluated only once, at the beginning of the loop. For example, suppose you have the following code in Ada

q := 10;
for i in 1..q loop
    q := 20;
    --// Do some stuff
end loop;

This loop will iterate exactly 10 times, because q was 10 when the loop started. However, if I write the seemingly equivalent loop in C:

q = 10;
for (int i=0;i<q;i++) {
   q = 20;
   // Do some stuff
}

Then the loop iterates 20 times, because q was changed to 20 by the time i got large enough for it to matter.

The C way is more flexible, of course. However this has quite a few negative implications. The obvious is that the program has to waste effort rechecking the loop condition every cycle. A good optimizer might be smart enough to work around the problem in simple cases like this, but what happens if "q" is a global and "do some stuff" includes a procedure call (and thus in theory could modify q)?

The hard fact is that we just know way more about the Ada loop than we do about the C loop. That means that with the same level of intelligence and effort in its optimizer, Ada can do a lot better job of optimizing. For instance, the Ada compiler knows that it can replace the entire loop with 10 copies of the contents, no matter what those contents are. A C optimizer would have to examine and analyze the contents.

This is actually just one of many ways where the design of the C syntax hamstrings the compiler.

T.E.D.
A: 

Maybe Knuth is referring to BASIC.

In BASIC (the older one, I dont know about VB), FOR loops were implemented differently.

E.g. to loop ten times

FOR N=1 TO 10

...

NEXT N

---- OR ---- to find sum of even numbers below 100

FOR N=0 TO 100 STEP 2

...

NEXT N

In C, you may have any condition in "for" to check for exiting the loop. More flexible, I think.

Satish Motwani
+1  A: 

What these language purist never seem to realize is the whole point of C, and to an extent C++, is giving the possibility to implement what you want and how you want it. Now admittedly if the result of your end expression changes, it would be better to just use a while loop. Maybe the problem is programmers get the impression C implements a "high level" for, but everyone should know C is a pretty low-level language.

FredV
+1  A: 

The very 'practicality' means that the would-be CS major has to learn Kernighan's mistakes in designing C, notably the infamous fact that a for loop evaluates the for condition repeatedly, which duplicates while and fails to match the behavior of most other languages which implement a for loop.

Muaahahahaa!

I like the basic (pun intented) assertions of the reviewer:

  • Kernighan's mistakes
  • infamous fact
  • fails to match the behavior of most other languages

For me, it smells of someone who never succeeded mastering C's basic features/philosphy.

Introduction

As I was still studying physics in university, I found C (i.e. C-like languages) were to become my language of choice when I discovered C's for loop.

So apparently, one's style failure is another's style success. This will end the subjective part of this discussion.

Perhaps Kernigan's for should have been called loop?

Just kidding? Perhaps not.

The author of the review apparently decided that each language construct should have the same behavior across languages. So renaming for as loop would have eased his/her discomfort.

The problem is that the author fails to understand that C's for is an extension of while, not a different loop. This means while is a syntactic sugar light version of for, instead of the while other languages where for may have been castrated down.

Is C for really needed?

Quoting the author of the review, he/she makes the following assertions about for and while:

  • for is for constant loops, from a beginning to an end
  • while is for loops evaluating a condition at each iteration

Having orthogonal features is a good thing when you can combine them. But in all languages I know, there is no way to combine both for and while together.

Example: What if, for the reviewer's language, you need a loop going from a beginning to an end (like a for), but still able to evaluate at each loop iteration (like a while): You could need to find in a subset of a container (not the whole container), a value according to some stateful test?

In a C-like language, you use the for. In the reviewer's ideal language, you just, well, hack some ugly code, crying because you don't have a for_while loop (or C's for loop).

Conclusion

I believe we can summarize the reviewer's critic of C (and C-like languages) with the following statement "Kernigan's infamous mistake of not giving C the syntax of the previous, existing languages", which can be summarized again as "Never think different".

Quoting Bjarne Stroustrup:

There are only two kinds of languages: the ones people complain about and the ones nobody uses.

So, all in all, the reviewer's comment should be considered as a praise.

^_^

paercebal