views:

255

answers:

4

Ok guys I thought I'd take my old CS notes and look through compiler theory a little more. I have to say I can't for the life in me remember how all this stuff works but I do have a nice sample application from my college days that helps me understand a few things.

This sample application takes the made up language and compiles it down to an intermediate assembly code like language. There is then a simple VM implementation that takes this intermediate language and executes the statements.

The thing I cant get my head around is, if I this were a straight up interpretor and not a compiler would it still be building up these intermeditary commands in memory to be executed at the end. Or does an interpretor actually "execute" descreet sections of the code chunks at a time?

+4  A: 

It depends on the language. Most modern interpreted languages (Perl, Python, and Ruby, to name a few) precompile the source code into some intermediary form to be executed at the end (citation).

Bill the Lizard
Is there an alternative? My goal is to write a very, VERY simple language parser and the thought of having to compile do to IL seems overkill somehow.
Owen
The problem is that you need *something* to execute your parsed commands. This is what perl, python, and java do for you. The only alternatives I know are to compile to machine code or target an existing "interpreter" (like the JVM).
Bill the Lizard
Not sure I follow you. What I mean is that the likes of the Java VM interpritate Javas IL. But if your core language is already very simple then why would an IL even be needed?
Owen
My fault. There's no absolute need for the IL. You could, for example, parse and execute your source one line at a time if you wanted. I wrote a simple calculator back in college that worked like this.
Bill the Lizard
You may not want to parse and execute at the same time. This can cause problems if you call a function that hasn't yet been defined, but is defined later on in a file. It might be best to translate it into some intermediate form, even internally, just so you get the full symbol table. Or something.
Chris Lutz
@Chris: Right. That all depends on the syntax rules of the language, though. In many older languages that would just be a syntax error.
Bill the Lizard
+1  A: 

Parsers don't compile. There are actually quite a few steps involved when translating a program (from say a high level language like C++ to machine code). It depends on the design if it executes in one go or after multiple passes over the input. Can you make your question a bit more specific? Meanwhile, however much you hate it, have a look here -- specially the frontend and backend sections.

dirkgently
Thanks for your help, do you know of any simple C#/Java interpretor that I could take a look at?
Owen
You have chosen some very complex languages. Start with something simple. May I suggest a stack based language like PostScript or Factor?
dirkgently
At the very least, a desk calculator.
dirkgently
Lol, no I did not mean an interpretor of C#/Java. I meant one written in these language. Yes, I would imagine that this is a little above me right now :-).
Owen
I don't use either of these much. But there are a host of others around to help you along. This looks a good start -- http://www.codeproject.com/KB/cs/csi.aspx
dirkgently
Thanks for the help, also found this one that looks interesting: http://www.codeproject.com/KB/cs/csi.aspx
Owen
If you want to find simple interpreters, you should try brainf*ck, but that may be too simple for practical learning applications. Look into lolcode (http://lolcode.com). It's a silly joke language, but there's a .NET compiler and a number of interpreters, and it has fairly complicated syntax.
Chris Lutz
+1  A: 

Most modern interpreters parse program to intermediate code which later is interpreted. Some store this intermediate code explicitly (eg. Python's .pyc). There are exceptions for example, shell scripts are directly interpreted, rather then parsed to intermediate format.

Some more advanced "interpreters" actually do not interpret, but do JIT (just-in-time) compiling (eg. Java or .NET).

vartec
+3  A: 

I have written or worked with interpreters that act directly upon parsing tokens in the input, that act directly on an abstract-syntax tree (AST) built by the parser, and that translate the AST to a form designed for efficient execution. So the answer is, it depends.

  • If your target machine has 8K of RAM, a direct-parsing interpreter would be a sensible choice (think FORTH).
  • If you are using interpreters to teach or learn about the structure and semantics of programming languages, building and interpreting an AST is a good choice.
  • If you are using an interpreter for portability and you want fast execution, compiling to a register-based virtual machine is a good choice. (David Gregg and others have shown there is less interpretive overhead in a register-based VM like the Lua VM than in a stack-based VM like the Java VM.)
Norman Ramsey
Might I add that Parrot seems to be kind of popular and new and shiny, and is also a register-based VM? A good idea for a VM to target maybe.
Chris Lutz