A couple of years ago I started writing an interpreter for a little domain-specific language which included programmer-defined functions.

At first I implemented variable scope using a simple stack of symbol-tables. But now I want to move to proper lexical scoping (with the option of closures). Can anyone explain or point me at a good explanation of the data-structure and algorithm behind lexical scope?

+2  A: 

There is no single right way to do this. The important thing is to clearly state the semantics that you are looking to provide, and then the data structures and algorithms will follow.

bmargulies
Sure. I can always try to derive the whole thing myself. :-) But for many well understood programming tasks, there are usually existing solutions that are already known and widely taught and adopted, no?
interstar
The book referenced in the comment to your question, or the famous book with the dragon on the cover, will take care of that.
bmargulies
+1  A: 

To get correct lexical scoping and closures in an interpreter, all you need to do is follow these rules:

  • In your interpreter, variables are always looked up in an environment table that is passed in by the caller (or kept as a variable), not in some global env-stack. That is, eval(expression, env) => value.
  • When interpreted code calls a function, the caller's environment is NOT passed to that function. apply(function, arguments) => value.
  • When an interpreted function is called, the environment its body is evaluated in is the environment in which the function definition was made, and has nothing whatsoever to do with the caller. So if you have a local function, then it is a closure, that is, a data structure containing fields {function definition, env-at-definition-time}.

To expand on that last bit in Python-ish syntax:

x = 1
return lambda y: x + y

gets executed as if it were

x = 1
return makeClosure(<AST for "lambda y: x + y">, {"x": x})

where the second dict argument may be just the current-env rather than a data structure constructed at that time. (On the other hand, retaining the entire env rather than just the closed-over variables can cause memory leaks.)
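As a rough illustration of the three rules above, here is a minimal sketch of such an interpreter. The tuple-based AST, the `Closure` class, and the names `evaluate` and `apply` are all made up for this example; a real DSL would have its own AST and its own environment representation.

```python
from dataclasses import dataclass

@dataclass
class Closure:
    params: list   # parameter names
    body: object   # AST of the function body
    env: dict      # environment captured at definition time

def evaluate(expr, env):
    """eval(expression, env) => value, for a tiny made-up tuple AST."""
    if isinstance(expr, (int, float)):
        return expr                    # literal
    if isinstance(expr, str):
        return env[expr]               # variable lookup in the env we were given
    op = expr[0]
    if op == "lambda":
        _, params, body = expr
        return Closure(params, body, env)   # capture the defining env
    if op == "+":
        return evaluate(expr[1], env) + evaluate(expr[2], env)
    if op == "call":
        fn = evaluate(expr[1], env)
        args = [evaluate(arg, env) for arg in expr[2:]]
        return apply(fn, args)         # note: the caller's env is not passed
    raise ValueError(f"unknown expression: {expr!r}")

def apply(fn, args):
    """apply(function, arguments) => value."""
    # The body runs in the definition-time env extended with the parameters.
    call_env = {**fn.env, **dict(zip(fn.params, args))}
    return evaluate(fn.body, call_env)

# The closure is defined where x is 1 ...
add_x = evaluate(("lambda", ["y"], ("+", "x", "y")), {"x": 1})
# ... and called from an environment where x is 100.
result = evaluate(("call", "f", 41), {"x": 100, "f": add_x})
print(result)  # 42: the closure sees x = 1, not the caller's x = 100
```

This sketch keeps a reference to the whole defining env, which is the simple option mentioned above; copying only the closed-over variables would avoid retaining the rest.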

Kevin Reid
+1  A: 

Read "The Implementation of Lua 5.0", for instance.

lhf
+1  A: 

There are many different ways to implement lexical scoping. Here are some of my favorites:

  • If you don't need super-fast performance, use a purely functional data structure to implement your symbol tables, and represent a nested function by a pair containing a pointer to the code and a pointer to the symbol table.

  • If you need native-code speeds, my favorite technique is described in "Making a Fast Curry" by Simon Marlow and Simon Peyton Jones.

  • If you need native-code speeds, but curried functions are not that important, consider closure-passing style.
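For the first option, one possible shape of a purely functional symbol table is an immutable linked list of bindings, where extending the table creates a new head and never modifies the old one. The names `Env`, `extend`, and `lookup` are invented for this sketch, and the "code" half of the pair is just a placeholder string.

```python
from typing import Any, NamedTuple, Optional

class Env(NamedTuple):
    """One immutable binding plus a pointer to the rest of the table."""
    name: str
    value: Any
    parent: Optional["Env"]

def extend(env, name, value):
    # Returns a new table; the old one is untouched and can still be shared.
    return Env(name, value, env)

def lookup(env, name):
    # Walk from the newest binding to the oldest.
    while env is not None:
        if env.name == name:
            return env.value
        env = env.parent
    raise NameError(name)

outer = extend(None, "x", 1)
# A nested function is a pair: (pointer to code, pointer to symbol table).
nested_fn = ("<code for: lambda y: x + y>", outer)

# Shadowing x later builds a new table and cannot disturb the captured one.
inner = extend(outer, "x", 2)
print(lookup(nested_fn[1], "x"))  # 1
print(lookup(inner, "x"))         # 2
```

Because tables are never mutated, capturing an environment is just copying a pointer, at the price of linear-time lookups in this naive version.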

Norman Ramsey
A: 

Stroustrup implemented this in the first C++ compiler simply with one symbol table per scope, and a chaining rule that followed scopes outwards until a definition was found. How this works exactly depends on your precise semantics. Make sure you nail those down first.
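The general idea of one table per scope with outward chaining could be sketched like this. It is only an illustration of the chaining rule, not a reconstruction of what the C++ compiler actually did; the `Scope` class and its method names are assumptions made for the example.

```python
class Scope:
    """One symbol table per scope, linked to the enclosing scope."""

    def __init__(self, parent=None):
        self.symbols = {}
        self.parent = parent

    def define(self, name, info):
        # Definitions always go into the innermost (current) scope.
        self.symbols[name] = info

    def resolve(self, name):
        # Chaining rule: search this scope, then each enclosing scope in turn.
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        raise NameError(f"undefined name: {name}")

global_scope = Scope()
global_scope.define("x", "global x")
global_scope.define("y", "global y")

function_scope = Scope(parent=global_scope)
function_scope.define("x", "local x")   # shadows the global x

print(function_scope.resolve("x"))  # local x
print(function_scope.resolve("y"))  # global y
```

For closures, the interpreter would keep a reference to the `Scope` that was current when the function was defined and use it as the parent of each call's scope.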

EJP