I've built an interpreter in C++ and everything works fine so far, but now I'm stuck on the design of the import/include function (whatever you want to call it).

I thought about the following:

  • Handling includes in the tokenizing process: when an include is found in the code, the tokenizing function is called recursively with the specified filename. The tokens of the included file are then spliced in at the position of the include. Disadvantage: no conditional includes(!)

  • Handling includes during the interpreting process: I don't know how to do this. All I know is that PHP must do it this way, since conditional includes are possible there.
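
For reference, the first option might look roughly like the sketch below. It is only an illustration under assumptions: the `include NAME` syntax, the `Files` map standing in for the file system, and the nesting limit of 32 are all made up for the example. Because the splice happens before anything is evaluated, there is no point at which a condition could be checked.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Assumption: an in-memory map stands in for the file system, and the
// hypothetical include syntax is the word `include` followed by a file name.
using Files = std::map<std::string, std::string>;

// Tokenizes `file`; whenever `include NAME` appears, the tokens of NAME are
// spliced in at that position by a recursive call.
std::vector<std::string> tokenize(const std::string& file, const Files& files,
                                  int depth = 0) {
    assert(depth >= 0);
    // Arbitrary limit so that a file including itself cannot recurse forever.
    if (depth > 32) throw std::runtime_error("include nesting too deep");
    auto it = files.find(file);
    if (it == files.end()) throw std::runtime_error("no such file: " + file);

    std::vector<std::string> tokens;
    std::istringstream in(it->second);
    for (std::string tok; in >> tok;) {
        std::string name;
        if (tok == "include" && in >> name) {
            std::vector<std::string> inner = tokenize(name, files, depth + 1);
            tokens.insert(tokens.end(), inner.begin(), inner.end());
        } else {
            tokens.push_back(tok);
        }
    }
    return tokens;
}
```

With a file `main` containing `a include lib b` and a file `lib` containing `x y`, the token stream for `main` comes out as `a x y b`.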

Now my questions:

  • What should I do about includes?
  • How do modern interpreters (Python/Ruby) handle this? Do they allow conditional includes?
A:

This problem is easy to solve if you have a clean design and you know what you're doing. Otherwise it can be very hard. I have written at least 6 interpreters that all have this feature, and it's fairly straightforward.

  1. Your interpreter needs to maintain an environment that knows about all the global variables, functions, types and so on that have been defined. You might feel more comfortable calling this the "symbol table".

  2. You need to define an internal function that reads a file and updates the environment. Depending on your language design, you might or might not do some evaluation the moment you read things in. My interpreters are very dynamic and evaluate each definition as soon as it is read in.

  3. Your life will be infinitely easier if you structure your interpreter in layers:

    • Tokenizer (breaks input into tokens)
    • Parser (reads one token at a time, converts to abstract-syntax tree)
    • Evaluator (reads the abstract syntax and updates the environment)
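
A minimal sketch of points 1 to 3, not taken from any of the interpreters mentioned here: the toy language (one `name = integer` definition per line) and every identifier (`Environment`, `Definition`, `tokenize`, `parse`, `evaluate`, `load`) are assumptions chosen for illustration. `load` plays the role of the internal function that reads input and updates the environment, evaluating each definition as soon as it is read.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical toy language: each line is `name = integer`.
using Environment = std::unordered_map<std::string, long>;

struct Definition {  // the abstract syntax for one line
    std::string name;
    long value = 0;
};

// Layer 1: tokenizer, splits a line on whitespace.
std::vector<std::string> tokenize(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> tokens;
    for (std::string tok; in >> tok;) tokens.push_back(tok);
    return tokens;
}

// Layer 2: parser, turns tokens into abstract syntax.
// Returns false if the line is not a well-formed definition.
bool parse(const std::vector<std::string>& tokens, Definition& out) {
    if (tokens.size() != 3 || tokens[1] != "=") return false;
    try {
        std::size_t used = 0;
        long value = std::stol(tokens[2], &used);
        if (used != tokens[2].size()) return false;  // trailing junk
        out.name = tokens[0];
        out.value = value;
        return true;
    } catch (const std::logic_error&) {  // invalid_argument or out_of_range
        return false;
    }
}

// Layer 3: evaluator, updates the environment.
void evaluate(const Definition& def, Environment& env) {
    assert(!def.name.empty());
    env[def.name] = def.value;
}

// Reads input and updates the environment, evaluating each definition as
// soon as it is read. Returns the number of malformed lines skipped.
int load(std::istream& in, Environment& env) {
    int skipped = 0;
    for (std::string line; std::getline(in, line);) {
        std::vector<std::string> tokens = tokenize(line);
        if (tokens.empty()) continue;  // blank line
        Definition def;
        if (parse(tokens, def)) evaluate(def, env);
        else ++skipped;
    }
    return skipped;
}
```

Taking a `std::istream&` rather than a file name means the same function works on a `std::ifstream` for real files and on a `std::istringstream` for tests.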

The abstract-syntax tree is really the key. If you have this, when you encounter the import/include construct in the input, you just make a recursive call and get more abstract syntax back. You can do this in the parser or the evaluator. If you want conditional import, you have to do it in the evaluator, since only the evaluator can compute a condition.
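
To make the conditional case concrete, here is a rough sketch of an evaluator that handles includes by recursion. Everything in it is an assumption made for the example: a separate toy language with three statements (`set NAME INT`, `include FILE`, `if NAME include FILE`), an in-memory `Files` map standing in for the file system, and an arbitrary nesting limit. The point is only that the include node sits in the abstract syntax until the evaluator reaches it, so the condition can be checked against the current environment first.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

using Environment = std::map<std::string, long>;
using Files = std::map<std::string, std::string>;  // stand-in for the file system

struct Stmt {  // one abstract-syntax node
    enum class Kind { Set, Include, IfInclude } kind;
    std::string name;  // variable for Set, condition for IfInclude
    long value;        // Set only
    std::string file;  // Include and IfInclude only
};

// Parser: one statement per line; throws on anything it does not recognize.
std::vector<Stmt> parse(const std::string& source) {
    std::vector<Stmt> program;
    std::istringstream lines(source);
    for (std::string line; std::getline(lines, line);) {
        std::istringstream in(line);
        std::vector<std::string> t;
        for (std::string tok; in >> tok;) t.push_back(tok);
        if (t.empty()) continue;
        if (t.size() == 3 && t[0] == "set")
            program.push_back({Stmt::Kind::Set, t[1], std::stol(t[2]), ""});
        else if (t.size() == 2 && t[0] == "include")
            program.push_back({Stmt::Kind::Include, "", 0, t[1]});
        else if (t.size() == 4 && t[0] == "if" && t[2] == "include")
            program.push_back({Stmt::Kind::IfInclude, t[1], 0, t[3]});
        else
            throw std::runtime_error("syntax error: " + line);
    }
    return program;
}

void evaluate(const std::vector<Stmt>& program, Environment& env,
              const Files& files, int depth = 0);

// Include is a recursive call: fetch the text, parse it, and evaluate the
// resulting abstract syntax against the same environment.
void include_file(const std::string& file, Environment& env,
                  const Files& files, int depth) {
    assert(depth >= 0);
    if (depth > 32) throw std::runtime_error("include nesting too deep");
    auto it = files.find(file);
    if (it == files.end()) throw std::runtime_error("no such file: " + file);
    evaluate(parse(it->second), env, files, depth + 1);
}

void evaluate(const std::vector<Stmt>& program, Environment& env,
              const Files& files, int depth) {
    for (const Stmt& s : program) {
        switch (s.kind) {
        case Stmt::Kind::Set:
            env[s.name] = s.value;
            break;
        case Stmt::Kind::Include:
            include_file(s.file, env, files, depth);
            break;
        case Stmt::Kind::IfInclude: {
            // The variable's value is only known here, at evaluation time.
            auto v = env.find(s.name);
            if (v != env.end() && v->second != 0)
                include_file(s.file, env, files, depth);
            break;
        }
        }
    }
}
```

In a real interpreter the condition would be an arbitrary expression and the file would be read from disk, but the shape stays the same: the included file is parsed only when the evaluator decides to include it.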

Source code for my interpreters is on the web. Two of them are written in C; the others are written in Standard ML.

Norman Ramsey
Are you going to put the lecture notes back online? And I read somewhere else that the accompanying book was supposed to be published several years ago. How's that coming along?
Wei Hu
@Wei: I changed jobs, which has set the book back a few years. At present I'm rewriting some of the software and am working on a new version of chapter 8 as well as extra material for chapter 7. I'll be teaching PL again next spring and hope to send a draft to publishers around then.
Norman Ramsey
@Norman thanks! Is it possible to read your draft, or is it only available to a limited number of course instructors?
Wei Hu
@Wei anybody interested should send me email [email protected]
Norman Ramsey