Is it because Pascal was designed to be so, or are there any tradeoffs?

Or what are the pros and cons to forbid or not forbid modification of the counter inside a for-block? IMHO, there is little use to modify the counter inside a for-block.

EDIT:
Could you provide one example where we need to modify the counter inside the for-block?

It is hard to choose between wallyk's answer and cartoonfox's answer, since both answers are so good. Cartoonfox analyses the problem from the language aspect, while wallyk analyses it from the historical and real-world aspect. Anyway, thanks for all of your answers, and I'd like to give my special thanks to wallyk.

+1  A: 

From For loop

In some languages (not C or C++) the loop variable is immutable within the scope of the loop body, with any attempt to modify its value being regarded as a semantic error. Such modifications are sometimes a consequence of a programmer error, which can be very difficult to identify once made. However only overt changes are likely to be detected by the compiler. Situations where the address of the loop variable is passed as an argument to a subroutine make it very difficult to check, because the routine's behaviour is in general unknowable to the compiler.

So this seems to be to help you not burn your hand later on.
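
A minimal C sketch of the hard-to-check case the quoted passage mentions, where the loop variable escapes by address. The names `bump` and `count_iterations` are made up for illustration.

```c
/* Hypothetical helper: advances the caller's counter through a pointer.
   A compiler that sees only the prototype cannot tell that it writes to *i. */
void bump(int *i) {
    *i += 1;
}

/* Counts how many times the body runs for a loop that, read on its own,
   appears to go from 0 up to n - 1. */
int count_iterations(int n) {
    int runs = 0;
    for (int i = 0; i < n; i++) {
        runs++;
        bump(&i);   /* hidden modification of the loop variable */
    }
    return runs;
}
```

Here `count_iterations(10)` runs the body 5 times rather than 10, and nothing in the loop header says so.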

astander
The biggest problem I have with that is that at least when I did Pascal/Delphi stuff this *wasn't* documented in the help. Funnily enough, I think it worked in Turbo Pascal 6 but no longer on Delphi.
Joey
Yes, loopvar is not immutable in Delphi (using that makes it a more costly while loop). The loopvar is still undefined after the loop, afaik.
Marco van de Voort
+3  A: 

It can make some optimizations (loop unrolling for instance) easier: no need for complicated static analysis to determine if the loop behavior is predictable or not.
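
As a rough illustration in C (the function name `sum_unrolled` is invented here), this is the kind of by-hand unrolling a compiler can do mechanically once it knows the body never touches the index:

```c
/* Sums a[0..n-1] with the body unrolled four times.
   The transformation is only valid because the trip count n is fixed
   on entry and the body never writes to i. */
long sum_unrolled(const int *a, int n) {
    long total = 0;
    int i = 0;
    for (; i + 3 < n; i += 4) {   /* four elements per pass */
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }
    for (; i < n; i++) {          /* leftover 0 to 3 elements */
        total += a[i];
    }
    return total;
}
```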

dmckee
+1  A: 

Disclaimer: It has been decades since I last did PASCAL, so my syntax may not be exactly correct.

You have to remember that PASCAL is Niklaus Wirth's child, and Wirth cared very strongly about reliability and understandability when he designed PASCAL (and all of its successors).

Consider the following code fragment:

FOR I := 1 TO 42 (* THE UNIVERSAL ANSWER *) DO FOO(I);

Without looking at procedure FOO, answer these questions: Does this loop ever end? How do you know? How many times is procedure FOO called in the loop? How do you know?

PASCAL forbids modifying the index variable in the loop body so that it is POSSIBLE to know the answers to those questions, and know that the answers won't change when and if procedure FOO changes.

John R. Strohm
Of course, giving for Pascal semantics does not ensure that the body of the for loop will terminate...
Charles Stewart
+7  A: 

Pascal was originally designed as a teaching language to encourage block-structured programming. Kernighan (the K of K&R) wrote an (understandably biased) essay on Pascal's limitations, Why Pascal is Not My Favorite Programming Language.

The prohibition on modifying what Pascal calls the control variable of a for loop, combined with the lack of a break statement, means that it is possible to know how many times the loop body is executed without studying its contents.

The lack of a break statement, together with not being able to use the control variable after the loop terminates, is more of a restriction than not being able to modify the control variable inside the loop, as it prevents some string and array processing algorithms from being written in the "obvious" way.
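
To make that concrete, here is a small sketch in C of a character search written both ways. Both function names are invented for this example; the second version follows the restrictions described above (no early exit from a for loop, no use of its control variable afterwards).

```c
/* "Obvious" version: leave the for loop early and use i afterwards.
   Returns the index of the first occurrence of c in s, or -1. */
int find_with_break(const char *s, char c) {
    int i;
    for (i = 0; s[i] != '\0'; i++) {
        if (s[i] == c)
            break;
    }
    return s[i] != '\0' ? i : -1;
}

/* Restricted version: the search becomes a while loop with an
   explicit flag instead of a for loop with break. */
int find_with_flag(const char *s, char c) {
    int i = 0;
    int found = 0;
    while (!found && s[i] != '\0') {
        if (s[i] == c)
            found = 1;
        else
            i++;
    }
    return found ? i : -1;
}
```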

These and other differences between Pascal and C reflect the different philosophies with which they were first designed: Pascal to enforce a concept of "correct" design, C to permit more or less anything, no matter how dangerous.

(Note: Delphi does have a Break statement however, as well as Continue, and Exit which is like return in C.)

Clearly we never need to be able to modify the control variable in a for loop, because we can always rewrite using a while loop. An example in C where such behaviour is used can be found in K&R section 7.3, where a simple version of printf() is introduced. The code that handles '%' sequences within a format string fmt is:

for (p = fmt; *p; p++) {
    if (*p != '%') {
        putchar(*p);
        continue;
    }
    switch (*++p) {
    case 'd':
        /* handle integers */
        break;
    case 'f':
        /* handle floats */
        break;
    case 's':
        /* handle strings */
        break;
    default:
        putchar(*p);
        break;
    }
}

Although this uses a pointer as the loop variable, it could equally have been written with an integer index into the string:

for (i = 0; i < strlen(fmt); i++) {
    if (fmt[i] != '%') {
        putchar(fmt[i]);
        continue;
    }
    switch (fmt[++i]) {
    case 'd':
        /* handle integers */
        break;
    case 'f':
        /* handle floats */
        break;
    case 's':
        /* handle strings */
        break;
    default:
        putchar(fmt[i]);
        break;
    }
}
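
For comparison, a sketch of the same scan rewritten as a while loop, so that no for-loop control variable is modified in the body. This is not K&R's code: the function name `scan_format`, the output buffer, and the '#' placeholder are assumptions made so the example can be tested on its own.

```c
/* While-loop version of the scanning loop: the index is advanced
   explicitly in every branch. To keep the sketch testable it writes
   to `out` (which must hold at least strlen(fmt) + 1 bytes) and emits
   '#' in place of each %d, %f or %s conversion. */
void scan_format(const char *fmt, char *out) {
    int i = 0;
    int o = 0;
    while (fmt[i] != '\0') {
        if (fmt[i] != '%') {
            out[o++] = fmt[i];
            i++;
        } else {
            i++;                      /* step past '%' */
            switch (fmt[i]) {
            case 'd':
            case 'f':
            case 's':
                out[o++] = '#';       /* placeholder for the conversion */
                i++;
                break;
            case '\0':
                /* trailing '%': stop instead of running past the end */
                break;
            default:
                out[o++] = fmt[i];
                i++;
                break;
            }
        }
    }
    out[o] = '\0';
}
```

With these assumptions, the input `"a%d b%%c%"` produces `"a# b%c"`.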
P-Nuts
`Turbo Pascal` has supported `break` for a long time.
Jichao
Yes, but Turbo Pascal supported `break` as a non-standard language extension: it is not part of standard Pascal (ISO 7185). I mentioned the extensions in the context of Delphi as Turbo Pascal is now only of historical interest (it may have been the first compiled language I ever used).
P-Nuts
Yes. I mentioned `Turbo Pascal` only to demonstrate that `Why Pascal is Not My Favorite Programming Language` is somewhat obsolete.
Jichao
Slightly unfair to dismiss Turbo Pascal as only of "historical interest" when in practical terms Delphi is Turbo Pascal (-: There's been a lot of evolution and growth since I got my (first) copy in ~1983 but essentially the one has grown (via the addition of objects, erm, some while ago) into the other.
Murph
+1  A: 

Hi

It's probably safe to conclude that Pascal was designed to prevent modification of a for loop index inside the loop. It's worth noting that Pascal is by no means the only language which prevents programmers doing this; Fortran is another example.

There are two compelling reasons for designing a language that way:

  1. Programs, specifically the for loops in them, are easier to understand and therefore easier to write and to modify and to verify.
  2. Loops are easier to optimise if the compiler knows that the trip count through a loop is established before entry to the loop and invariant thereafter.

For many algorithms this is the required behaviour: updating all the elements in an array, for example. If memory serves, Pascal also provides while-do loops and repeat-until loops. Most, I guess, of the algorithms which are implemented in C-style languages with modifications to the loop index variable or breaks out of the loop could just as easily be implemented with these alternative forms of loop.
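
One way to picture "trip count established before entry" is to emulate it in C. This is only a sketch under that reading; `sum_range` is a made-up name, and the overflow of `hi - lo + 1` for extreme bounds is ignored.

```c
/* Emulates a Pascal-style `for i := lo to hi`: the trip count is
   computed once, before the first iteration, and the body only ever
   sees a copy of the index. */
int sum_range(int lo, int hi) {
    int trips = hi >= lo ? hi - lo + 1 : 0;   /* fixed before entry */
    int total = 0;
    for (int k = 0; k < trips; k++) {
        int i = lo + k;   /* fresh copy of the index for this pass */
        total += i;
        i = hi;           /* writing to the copy cannot change trips */
        (void)i;
    }
    return total;
}
```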

I've scratched my head and failed to find a compelling reason for allowing the modification of a loop index variable inside the loop, but then I've always regarded doing so as bad design, and the selection of the right loop construct as an element of good design.

Regards

Mark

High Performance Mark
+11  A: 

In programming language theory (and in computability theory) WHILE and FOR loops have different theoretical properties:

  • a WHILE loop may never terminate (the expression could just be TRUE)
  • the finite number of times a FOR loop is to execute is supposed to be known before it starts executing. You're supposed to know that FOR loops always terminate.

The FOR loop present in C doesn't technically count as a FOR loop because you don't necessarily know how many times the loop will iterate before executing it. (i.e. you can hack the loop counter to run forever)

The class of problems you can solve with WHILE loops is strictly larger than the class you can solve with the strict FOR loop found in Pascal.

Pascal is designed this way so that students have two different loop constructs with different computational properties. (If you implemented FOR the C-way, the FOR loop would just be an alternative syntax for while...)

In strictly theoretical terms, you shouldn't ever need to modify the counter within a for loop. If you could get away with it, you'd just have an alternative syntax for a WHILE loop.
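
A small sketch of the two kinds of loop, written in C for illustration (both function names are chosen here, not taken from the lecture notes). The first is used in the strict FOR sense; the second needs a genuine WHILE.

```c
/* Bounded (FOR-style) loop: the number of passes depends only on n,
   which is fixed on entry, and the body never writes to i. */
unsigned long factorial(unsigned int n) {
    unsigned long result = 1;
    for (unsigned int i = n; i > 1; i--)
        result *= i;
    return result;
}

/* Unbounded (WHILE-style) loop: the number of passes is not known on
   entry. Whether this loop terminates for every starting value is the
   open Collatz problem. */
unsigned int collatz_steps(unsigned long n) {
    unsigned int steps = 0;
    while (n > 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        steps++;
    }
    return steps;
}
```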

You can find out more about "while loop computability" and "for loop computability" in these CS lecture notes: http://www-compsci.swan.ac.uk/~csjvt/JVTTeaching/TPL.html

Another such property, btw, is that the loop variable is undefined after the for loop. This also makes optimization easier.

cartoonfox
This is the best answer so far.
MaD70
"The for statement is used when the number of iterations in known beforehand" - Wirth 1973, The Programming Language Pascal (Revised Report) http://maben.homeip.net/static/S100/software/pascal/1973%20The%20Programming%20Language%20Pascal.pdf
Charles Stewart
+6  A: 

Pascal was first implemented for the CDC Cyber (a 1960s and 1970s mainframe) which, like many CPUs today, had excellent sequential instruction execution performance, but also a significant performance penalty for branches. This and other characteristics of the Cyber architecture probably heavily influenced Pascal's design of for loops.

The Short Answer is that allowing assignment to a loop variable would have required extra guard code and would have interfered with optimization of loop variables, which could ordinarily be handled well in 18-bit index registers. In those days, software performance was highly valued due to the expense of the hardware and the inability to speed it up any other way.

Long Answer

The Control Data Corporation 6600 family, which includes the Cyber, is a RISC architecture using 60-bit central memory words referenced by 18-bit addresses. Some models had an (expensive, therefore uncommon) option, the Compare-Move Unit (CMU), for directly addressing 6-bit character fields, but otherwise there was no support for "bytes" of any sort. Since the CMU could not be counted on in general, most Cyber code was generated for its absence. Ten characters per word was the usual data format until support for lowercase characters gave way to a tentative 12-bit character representation.

Instructions are 15 bits or 30 bits long, except for the CMU instructions, which are effectively 60 bits long. So each word packs up to four 15-bit instructions, or two 30-bit instructions, or a pair of 15-bit instructions and one 30-bit instruction. 30-bit instructions cannot span words. Since branch destinations may only reference words, jump targets are word-aligned.

The architecture has no stack. In fact, the procedure call instruction RJ is intrinsically non-re-entrant. RJ modifies the first word of the called procedure by writing a jump to the next instruction after where the RJ instruction is. Called procedures return to the caller by jumping to their beginning, which is reserved for return linkage. Procedures begin at the second word. To implement recursion, most compilers made use of a helper function.

The register file has eight instances each of three kinds of register, A0..A7 for address manipulation, B0..B7 for indexing, and X0..X7 for general arithmetic. A and B registers are 18 bits; X registers are 60 bits. Setting A1 through A5 has the side effect of loading the corresponding X1 through X5 register with the contents of the loaded address. Setting A6 or A7 writes the corresponding X6 or X7 contents to the address loaded into the A register. A0 and X0 are not connected. The B registers can be used in virtually every instruction as a value to add or subtract from any other A, B, or X register. Hence they are great for small counters.

For efficient code, a B register is used for loop variables since direct comparison instructions can be used on them (B2 < 100, etc.); comparisons with X registers are limited to relations to zero, so comparing an X register to 100, say, requires subtracting 100 and testing the result for less than zero, etc. If an assignment to the loop variable were allowed, a 60-bit value would have to be range-checked before assignment to the B register. This is a real hassle. Herr Wirth probably figured that the hassle and the inefficiency weren't worth the utility: the programmer can always use a while or repeat...until loop in that situation.
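
A rough C model of the two operations described above. It is not Cyber code: the function names are invented, `int64_t` stands in for a 60-bit X register, and the 18-bit range used is an assumption (a symmetric ones'-complement range).

```c
#include <stdint.h>

/* Assumed range of an 18-bit register (symmetric, ones'-complement style). */
#define B_MAX ((1 << 17) - 1)
#define B_MIN (-B_MAX)

/* "x < limit" done as described for X registers:
   subtract first, then test the result against zero. */
int less_than_via_subtract(int64_t x, int64_t limit) {
    return (x - limit) < 0;
}

/* The guard that would be needed before copying a wide value into a
   narrow index register, if assigning to the loop variable were allowed. */
int fits_b_register(int64_t v) {
    return v >= B_MIN && v <= B_MAX;
}
```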

Additional weirdness

Several unique-to-Pascal language features relate directly to aspects of the Cyber:

  • the packed keyword: either a single "character" consumes a 60-bit word, or it is packed ten characters per word.
  • the (unusual) alfa type: packed array [1..10] of char
  • intrinsic procedures pack() and unpack() to deal with packed characters. These perform no transformation on modern architectures, only type conversion.
  • the weirdness of text files vs. file of char
  • no explicit newline character. Record management was explicitly invoked with writeln
  • While set of char was very useful on CDCs, it was unsupported on many subsequent 8 bit machines due to its excess memory use (32-byte variables/constants for 8-bit ASCII). In contrast, a single Cyber word could manage the native 62-character set by omitting newline and something else.
  • full expression evaluation (versus shortcuts). These were implemented not by jumping and setting one or zero (as most code generators do today), but by using CPU instructions implementing Boolean arithmetic.
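
The last point, full evaluation versus short-circuit evaluation, can be shown with a small C sketch. The names are invented for the example, and the bitwise `&` here only stands in for the Boolean-arithmetic approach; it is not a model of the Cyber instructions.

```c
int rhs_calls = 0;   /* counts evaluations of the right-hand operand */

int is_positive(int x) {
    rhs_calls++;
    return x > 0;
}

/* Short-circuit: the right operand is skipped when the left is false,
   which implies a conditional jump. */
int both_short_circuit(int a, int b) {
    return (a > 0) && is_positive(b);
}

/* Full evaluation: both operands are always computed, then combined
   with a bitwise AND of the two 0/1 results. */
int both_full(int a, int b) {
    return (a > 0) & is_positive(b);
}
```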
wallyk
+1 for the history aspect.
Jichao
Is that you Mel?
ergosys
Thanks, but Mel predates me by a few years. (It was hard to resist describing the PPUs—peripheral processing units.)
wallyk