I've been working on Issue 14 on the PascalScript scripting engine, in which using a Goto command to jump out of a Case block produces a compiler error, even though this is perfectly valid (if ugly) Object Pascal code.

Turns out the ProcessCase routine in the compiler calls HasInvalidJumps, which scans for any Gotos that lead outside of the Case block, and gives a compiler error if it finds one. If I comment that check out, it compiles just fine, but ends up crashing at runtime. A disassembly of the bytecode shows why. I've annotated it with the original script code:

[TYPES]
<SNIPPED>
[VARS]
Var [0]: 27 Class TFORM
Var [1]: 28 Class TAPPLICATION
Var [2]: 11 S32 //i: integer
[PROCS]
Proc [0] Export: !MAIN -1
{begin}
 [0] ASSIGN GlobalVar[2], [1]
{ i := 1;}
 [15] PUSHTYPE 11(S32) // 1
 [20] ASSIGN Base[1], GlobalVar[2]
{ case i of}
 [31] PUSHTYPE 25(U8) // 2
{   0:}
 [36] COMPARE into Base[2]: [0] = Base[1]
 [57] COND_NOT_GOTO currpos + 5 Base[2] [72]
{   end;}
 [67] GOTO currpos + 41 [113]
{   1:}
 [72] COMPARE into Base[2]: [1] = Base[1]
 [93] COND_NOT_GOTO currpos + 10 Base[2] [113]
{     goto L1;}
 [103] GOTO currpos + 8 [116]
{   end;}
 [108] GOTO currpos + 0 [113]
{ end; //<-- case}
 [113] POP // 1
 [114] POP // 0
{ Exit;}
 [115] RET
{L1:
 Writeln('Label L1');}
 [116] PUSHTYPE 17(WideString) // 1
 [121] ASSIGN Base[1], ['????????']
 [144] CALL 1
{end.}
 [149] POP // 0
 [150] RET
Proc [1]: External Decl: \00\00 WRITELN

The "goto L1;" statement at 103 skips the cleanup pops at 113 and 114, which leaves the stack in an invalid state.
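To make the imbalance concrete, here is a rough sketch in Python that replays the listing above while tracking only stack depth. The addresses, opcodes and jump targets are copied from the disassembly; everything else is an assumption made for illustration: the stack effects (+1 for PUSHTYPE, -1 for POP, 0 for the rest), the way COMPARE and COND_NOT_GOTO are modelled, and the `i` parameter. It is not how the real PascalScript interpreter works internally.

```python
# Simplified model of Proc [0]: address -> (opcode, argument).
# For COMPARE the argument is the case constant; for jumps it is the target.
PROGRAM = {
    0:   ("ASSIGN", None),
    15:  ("PUSHTYPE", None),
    20:  ("ASSIGN", None),
    31:  ("PUSHTYPE", None),
    36:  ("COMPARE", 0),
    57:  ("COND_NOT_GOTO", 72),
    67:  ("GOTO", 113),
    72:  ("COMPARE", 1),
    93:  ("COND_NOT_GOTO", 113),
    103: ("GOTO", 116),   # goto L1: skips the POPs at 113 and 114
    108: ("GOTO", 113),
    113: ("POP", None),
    114: ("POP", None),
    115: ("RET", None),
    116: ("PUSHTYPE", None),
    121: ("ASSIGN", None),
    144: ("CALL", None),
    149: ("POP", None),
    150: ("RET", None),
}


def depth_at_ret(i):
    """Return (address of the RET reached, stack depth at that RET)."""
    addrs = sorted(PROGRAM)
    pc, depth, flag = 0, 0, False
    while True:
        op, arg = PROGRAM[pc]
        # Default successor is the next instruction in address order.
        nxt = addrs[addrs.index(pc) + 1] if op != "RET" else None
        if op == "PUSHTYPE":
            depth += 1
        elif op == "POP":
            depth -= 1
        elif op == "COMPARE":
            flag = (arg == i)
        elif op == "COND_NOT_GOTO" and not flag:
            nxt = arg
        elif op == "GOTO":
            nxt = arg
        elif op == "RET":
            return pc, depth
        pc = nxt


print(depth_at_ret(1))  # path through "goto L1"
print(depth_at_ret(0))  # path through the normal end of the case
```

Under these assumptions, the `i = 1` path reaches the RET at 150 with two entries still on the stack, while the `i = 0` path reaches the RET at 115 with the stack balanced.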

Delphi doesn't have any trouble with this, because it doesn't use a calculation stack. PascalScript, though, is not as fortunate. I need some way to make this work, because this pattern is very common in some legacy scripts that I've translated to PascalScript and need to support. They come from a much simpler system with little in the way of control structures.

Anyone have any ideas how to patch the codegen so it'll clean up the stack properly?

+1  A: 

The straightforward solution would be:

When generating a GOTO for a goto statement, prefix the GOTO with the same cleanup code that comes before RET.

Martin Konicek
While unwinding the stack may work here, I'm not certain it will work for all cases.
skamradt
And tomorrow you get a goto that exits two nested case statements, but then does have some code after the label, and you can start over.
Marco van de Voort
+3  A: 

IIRC the goto rules in classic pascals were:

  • jumps are only allowed out of a block (iow from a higher to a lower nesting level on the "same" branch of the tree)
  • from local procedures to their parents.

The latter was afaik never supported by Borland-derived Pascals, but the first still holds.

So you need to generate exiting code like Martin says, but possibly for multiple block levels. That means you can't emit one fixed cleanup sequence for every goto; you must generate code that exits exactly the number of blocks needed.

A typical test pattern is to exit from multiple nested ifs (possibly within a loop) using a goto, since that was a classic microoptimization that was faster at least up to D7.

Keep in mind that the if evaluation(s) and the begin..end blocks of their branches might have generated temps that need cleanup.

---------- added later

I think the code generator needs a way to walk the scopes between the goto and its endpoint, generating the relevant exit code for blocks along the way. That way a fix works for the general case and not just this example. Since you can only jump out of scopes, and not into them, that might not be that hard.

IOW generate something that is equivalent to (for a hypothetical double case block)

Lgoto1gluecode:
  // exit code first block
  pop x
  pop y
  // exit code second block
  pop A
  pop B
  goto real_goto_destination

Additional analysis can be done. E.g. if there is only one scope, and it already has a cleanup exit label, you can jump directly to that label. If you know for certain that the above pops only discard values (and do not restore saved registers), you can do them all at once with add $16,%esp (4 values of 4 bytes each), etc.
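The scope walk described above could be sketched roughly as follows. This is a Python model, not PascalScript compiler code: the `Scope` class, its `temps` count and the `emit_goto` function are hypothetical names invented for this illustration, and the real compiler would emit bytecode rather than strings.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Scope:
    """Hypothetical block record: how many stack entries the block pushed."""
    name: str
    temps: int
    parent: Optional["Scope"] = None


def emit_goto(from_scope: Scope, label_scope: Scope, target: str) -> List[str]:
    """Emit exit code for every scope left by the jump, then the jump itself."""
    code: List[str] = []
    scope: Optional[Scope] = from_scope
    while scope is not label_scope:
        if scope is None:
            # The label is not in an enclosing scope: jumping into a block.
            raise ValueError("goto target is not in an enclosing scope")
        code.append(f"// exit code for {scope.name}")
        code.extend(["POP"] * scope.temps)
        scope = scope.parent
    code.append(f"GOTO {target}")
    return code


# Hypothetical double case block, each case holding two temporaries.
proc = Scope("procedure body", 0)
outer = Scope("outer case", 2, proc)
inner = Scope("inner case", 2, outer)

for line in emit_goto(inner, proc, "L1"):
    print(line)
```

For the single case block in the question, the walk would cross one scope holding two temporaries (the two PUSHTYPE entries), so the glue code would be two POPs followed by the GOTO. Walking parent links also gives the "only out, never in" rule for free: a label that is not in an enclosing scope is never reached and the walk fails.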

Marco van de Voort
Yeah, out of a block to a higher branch on the same procedure is what I'm trying to accomplish here.
Mason Wheeler
+1  A: 

It looks to me like the calculation of how far to jump forward is the problem. I would have to spend some time looking at the implementation of the parser to help further, but my guess is that additional handling is needed when a goto is used while there are values on the stack AND the goto target lies after the point where those values would be removed from the stack. To determine this, you would need to save the current location being parsed (the goto) and then parse forward to the target location, watching for stack changes. If there are any, either adjust the goto location backwards or inject the cleanup code as Martin suggested.

skamradt