ansaurus

Question

What opcode dispatch strategies are used in efficient interpreters?

Answer 1

+3 A:

Jason S 2009-02-04 14:28:30

I am intrigued by this "JIT" compilation, but what does it buy me? Will the program begin execution as quickly as an interpreter that does not do JIT? Will it use more memory? Should I JIT the entire program, or only the hot spots? Do I have to do extra work to run on ARM as well as x86?

joeforker 2009-02-04 15:16:57

I may have misinterpreted your question... I don't know anything about how to actually do JIT, just that it's now used a lot in bytecode and other interpreters (Java and Javascript and MATLAB all use it).

Jason S 2009-02-04 15:19:17

I'm looking for the name of the technique, a short explanation, and a reason why you might choose that particular technique. JIT uses more memory, probably takes longer to start up, but it can be super fast. Biggest drawback is that the JIT is platform dependent.

joeforker 2009-02-04 15:34:59

Answer 2

+2 A:

Befor you start anything, check Lua.

It's small (150Kb), pure ANSI C, works on anything having C compiler. Very fast.

And most important - source code is clean and readable. Worth checking out.

dmajkic 2009-02-04 14:36:51

lua is wonderful. I especially enjoyed the paper about their implementation of number- and hash- indexed arrays in the same data structure. Put zero-based arrays in the next one and I'm onboard :-)

joeforker 2009-02-04 15:14:15

Answer 3

A:

Benchmarking is a good technique on making anything fast on given platform(s). Test, refine, test again, improve.

I don't think you can get any better answer. There's lot of techniques to make interpreters. But I give you a tip, do not do trade offs, just choose what you really need and pursue those goals.

Cheery 2009-02-04 14:58:34

Trade-offs are unavoidable. C is fast but you can only run the binary on one platform. Java runs on lots of platforms but needs a big smart runtime.

joeforker 2009-02-04 15:12:28

Answer 4

+1 A:

One big win is to store the source code in an intermediate form, rather than redoing lexical analysis and parsing during execution.

This can range all the way from just storing the tokens, through Forth style threaded code and on to JIT compilation.

Darron 2009-02-04 15:05:15

Answer 5

+1 A:

The question is a bit vague. But, it seems you're asking about writing an interpreter.

Interpreters typically utilize traditional parsing components: lexer, parser, and abstract syntax tree (AST). This allows the designer to read and interpret valid syntax and build a tree structure of commands with associated operators, parameters, etc.

Once in AST form, the entire input is tokenized and the interpreter can begin executing by traversing the tree.

There are many options, but I've recently used ANTLR as a parser generator that can build parsers in various target languages, including C/C++ and C#.

spoulson 2009-02-04 15:14:28

Answer 6

+2 A:

Indirect threading is a strategy where each opcode implementation has its own JMP to the next opcode. The patch to the Python interpreter looks something like this:

add:
    result = a + b;
    goto *opcode_targets[*next_instruction++];

opcode_targets maps the instruction in the language's bytecode to the location in memory of the opcode implementation. This is faster because the processor's branch predictor can make a different prediction for each bytecode, in contrast to a switch statement that has only one branch instruction.

The compiler must support computed goto for this to work, which mostly means gcc.

Direct threading is similar, but in direct threading the array of opcodes is replaced with pointers to the opcode implentations like so:

goto *next_opcode_target++;

These techniques are only useful because modern processors are pipelined and must clear their pipelines (slow) on a mispredicted branch. The processor designers put in branch prediction to avoid having to clear the pipeline as often, but branch prediction only works for branches that are more likely to take a particular path.

joeforker 2009-02-04 16:01:06

true for processors that do branch prediction, including x86, x86_64, and so on.

Cheery 2009-02-04 19:56:10

@Cheery it's not the architecture but the specific processor model that includes branch prediction, but you would be hard pressed to be reading stackoverflow with a processor that does not implement it.

joeforker 2009-02-04 22:15:10

Answer 7

+2 A:

There are a number of papers on different kinds of dispatch:

M. Anton Ertl and David Gregg, Optimizing Indirect Branch Prediction Accuracy in Virtual Machine Interpreters, in Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation (PLDI 03), pp. 278-288, San Diego, California, June 2003.

M. Anton Ertl and David Gregg, The behaviour of efficient virtual machine interpreters on modern architectures, in Proceedings of the 7th European Conference on Parallel Computing (Europar 2001), pp. 403-412, LNCS 2150, Manchester, August 2001.

An excellent summary is provided by Yunhe Shi in his PhD thesis.

Also, someone discovered a new technique a few years ago which is valid ANSI C.

Paul Biggar 2009-07-29 16:10:59

ansaurus

tags:

views:

answers:

What opcode dispatch strategies are used in efficient interpreters?

related questions