views: 941
answers: 6

I remember a professor once saying that interpreted code was about 10 times slower than compiled code. What's the speed difference between interpreted code and bytecode? (Assume the bytecode isn't JIT-compiled.)

I ask because some folks have been kicking around the idea of compiling Vim script to bytecode, and I wonder what kind of performance boost that would get.

+4  A: 

When you compile things down to bytecode, you have the opportunity to first perform a bunch of expensive high-level optimizations. You design the bytecode to be very easily compiled to machine code, and you run all the optimizations and flow analysis ahead of time.

The speed increase is thus potentially quite substantial - not only do you skip the whole lexing/parsing stage at runtime, but you also have more opportunity to apply optimizations and generate better machine code.
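
To make that concrete, here's a rough sketch in Python with an invented mini instruction set (PUSH, LOAD, and ADD are made up for the example): the constant arithmetic is folded once at compile time, and the run loop never touches source text.

    # Toy illustration: fold constant sub-expressions at compile time,
    # then emit a flat list of (opcode, argument) pairs. The opcode
    # names (PUSH, LOAD, ADD) are made up for this example.

    def compile_add_chain(terms):
        """Compile a sum of integer literals and variable names."""
        constant = sum(t for t in terms if isinstance(t, int))  # folded once
        code = [("PUSH", constant)]
        for t in terms:
            if isinstance(t, str):            # a variable reference
                code.append(("LOAD", t))
                code.append(("ADD", None))
        return code

    def run(code, env):
        stack = []
        for op, arg in code:                  # no lexing or parsing here
            if op == "PUSH":
                stack.append(arg)
            elif op == "LOAD":
                stack.append(env[arg])
            elif op == "ADD":
                b, a = stack.pop(), stack.pop()
                stack.append(a + b)
        return stack.pop()

    # 1 + 2 + x + 3 compiles to PUSH 6; LOAD x; ADD. The folding happened
    # once, at compile time, not on every execution.
    print(run(compile_add_chain([1, 2, "x", 3]), {"x": 10}))  # 16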

Eclipse
Based on what Java (the most widespread bytecode-based platform) does, you're completely wrong. There is very little optimization done during source code compilation; the large majority of optimization is done by the JIT compiler, and the JVM is a stack-based architecture which does not map very well at all to real-world instruction sets.
Michael Borgwardt
He's not asking about Java, he's asking about bytecode in general. Going from an interpreter to bytecode plus a JIT provides the opportunity to do early optimizations and to select a bytecode that maps well to machine language. Java's designers had reasons for choosing the bytecode that they did, but that's not the only way to do it.
Eclipse
+3  A: 

You could see a pretty good boost. However, there are a lot of factors. You can't just say that compiled code is always about 10 times faster than interpreted code, or that bytecode is n times faster than interpreted code.

Factors include, for example, the complexity and verbosity of the language. If a keyword in the language is several characters and the bytecode is one byte, it should be quite a bit faster to load the bytecode and jump to the routine that handles it than it is to read the keyword string and then figure out where to go. But if you're interpreting one of the exotic languages that has one-byte keywords, the difference might be less noticeable.
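
As a sketch of that difference (the instruction set and handlers below are invented for illustration), compare a loop that re-compares keyword strings on every step with one that uses each byte as a direct table index. In a C interpreter, the second form is a single array index and an indirect jump, with no string handling at all.

    # Two toy ways to execute the same instruction stream. The
    # instruction set and handlers are invented for illustration.

    def handle_increment(state):
        state["n"] += 1

    def handle_reset(state):
        state["n"] = 0

    # 1) Text form: every step re-reads a keyword string and compares it.
    def run_text(program, state):
        for word in program:
            if word == "increment":
                handle_increment(state)
            elif word == "reset":
                handle_reset(state)

    # 2) Bytecode form: each instruction is one byte used as a table index.
    TABLE = [handle_reset, handle_increment]   # opcode 0, opcode 1

    def run_bytecode(program, state):
        for op in program:
            TABLE[op](state)

    s1, s2 = {"n": 0}, {"n": 0}
    run_text(["increment", "increment", "reset", "increment"], s1)
    run_bytecode(bytes([1, 1, 0, 1]), s2)
    print(s1["n"], s2["n"])                    # 1 1 - same result either way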

I've seen this performance boost in practice, so it might be worth it for you. Besides, it's fun to write such a thing, it gives you a feel for how language interpreters and compilers work, and that will make you a better coder.

Don Branson
Where would one begin? I wrote a compiler in college, but it was for a really simple language, and I can't even imagine writing a bytecode interpreter.
Whaledawg
Most of the difficult work is going to be compiling to the bytecode in the first place; after that, your interpreter is going to be little more than a glorified state machine. Just use the bytecode directly to index into an array of routines (see the sketch after these comments). It will only get difficult if you want to JIT it.
Eclipse
I can't really give you current advice on this - I wrote a compiler that compiled to bytecode sometime in the 80s, and that was in 6502 assembler on an Ohio Scientific C1P. So, things may have changed since then ;)
Don Branson
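
To make "glorified state machine" concrete, here is a minimal sketch (the instruction set is invented for this example): a program counter steps through the bytecode, and each opcode just indexes into a tuple of routines.

    # A tiny bytecode VM: a program counter stepping through (op, arg)
    # pairs, with each opcode indexing directly into a tuple of routines.

    def op_push(vm, arg):
        vm["stack"].append(arg)

    def op_add(vm, arg):
        vm["stack"].append(vm["stack"].pop() + vm["stack"].pop())

    def op_jump_if(vm, arg):
        if vm["stack"].pop():
            vm["pc"] = arg                 # redirect the state machine

    def op_halt(vm, arg):
        vm["pc"] = None

    ROUTINES = (op_push, op_add, op_jump_if, op_halt)
    PUSH, ADD, JUMP_IF, HALT = range(4)

    def run(program):
        vm = {"stack": [], "pc": 0}
        while vm["pc"] is not None:
            op, arg = program[vm["pc"]]
            vm["pc"] += 1
            ROUTINES[op](vm, arg)          # direct index, no string lookups
        return vm["stack"]

    # Push 2 and 3, add them, halt.
    print(run([(PUSH, 2), (PUSH, 3), (ADD, None), (HALT, None)]))  # [5]
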
A: 

It depends on your virtual machine. Some of the faster virtual machines (e.g., the JVM) are approaching the speed of C code. So how fast is your interpreted code running compared to C?

Don't assume that if you convert your interpreted code into bytecode it will run as fast as Java (at near-C speeds); years of performance tuning have gone into that. But you should see a significant speed boost.

Emacs Lisp can be byte-compiled, with increased performance. It might be worth a look for you.

WolfmanDragon
A: 

I've never come across a Vim script that was slow enough to notice. Assuming a script primarily calls built-in, native-code operations (regexes, block operations, etc.) that are implemented in the editor's core, even a 10x speed-up of the 'glue logic' in the scripting layer would be insignificant.

Still, profiling is the only way to be really sure.

Sean McSomething
A: 

Are there any mainstream "interpreters" these days that don't actually compile their code (either to bytecode or something similar)?

For instance, when you run a Perl program directly from its source code, the first thing it does is compile the source into a syntax tree, which it then optimizes and uses to execute the program. In normal situations, the time spent compiling is tiny compared to the time spent actually running the program.
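
As a toy illustration of that compile-once, run-many split (this miniature is invented; it's not Perl's actual internals), the tree is built and simplified a single time, and execution only walks the finished tree:

    # Parse once into a tree, optimize it, then evaluate the tree as
    # many times as needed.

    def parse(src):
        """Parse 'a + b + c' into a nested ('+', left, right) tree."""
        parts = src.split("+")
        tree = parts[0].strip()
        for p in parts[1:]:
            tree = ("+", tree, p.strip())
        return tree

    def fold(tree):
        """One compile-time optimization: fold literal + literal."""
        if isinstance(tree, str):
            return int(tree) if tree.isdigit() else tree
        _, l, r = tree
        l, r = fold(l), fold(r)
        if isinstance(l, int) and isinstance(r, int):
            return l + r
        return ("+", l, r)

    def evaluate(tree, env):
        """Runtime: walk the already-built tree; no text in sight."""
        if isinstance(tree, int):
            return tree
        if isinstance(tree, str):
            return env[tree]
        _, l, r = tree
        return evaluate(l, env) + evaluate(r, env)

    compiled = fold(parse("1 + 2 + x"))       # done once
    for x in (10, 20):                        # run many times
        print(evaluate(compiled, {"x": x}))   # 13, then 23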

Sticking with this example, Perl obviously cannot be faster than well-optimized C code, as it is written in C. In practice, for most things you would normally do with Perl (like text processing), it will be as fast as you could reasonably code it in C, and orders of magnitude easier to write. On the other hand, I certainly wouldn't try to write a high-performance math routine directly in Perl.

Sol
+1  A: 

Also, a lot of "classic" interpreters include the lex/parse phase along with execution.

For example, consider executing a Python script. When you do that, you pay all the costs associated with converting the program text into the internal interpreter data structures, which are then executed.

Now contrast that with executing a compiled Python script, a .pyc file. Here, the lex and parse phases are already done, and you have just the runtime of the inner interpreter.
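
You can see that split from within Python itself: the built-in compile() does the lex/parse work once and returns a code object (essentially what a .pyc caches, minus a small header), and executing that object involves no text processing at all.

    import dis

    # compile() does the lex/parse/compile work once; the resulting
    # code object is what a .pyc file caches (plus a small header).
    code = compile("total = sum(i * i for i in range(10))", "<example>", "exec")

    dis.dis(code)      # inspect the bytecode the inner interpreter will run

    ns = {}
    exec(code, ns)     # pure bytecode execution: no text processing here
    print(ns["total"]) # 285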

But consider, say, a classic BASIC interpreter. These typically never store the raw text; rather, they store a tokenized form and recreate the program text when you do "LIST". Here the bytecode is much cruder (you don't really have a virtual machine), but execution gets to skip some of the text processing. That's all done when you enter the line and hit ENTER.
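
A rough sketch of that scheme (the token values below are invented; real BASICs used similar one-byte codes): keywords are crunched to single bytes when the line is entered, and LIST just reverses the mapping.

    # Toy version of classic BASIC line storage: keywords become
    # one-byte tokens on ENTER; LIST regenerates the text from them.

    KEYWORDS = {"PRINT": 0x80, "GOTO": 0x81, "IF": 0x82, "THEN": 0x83}
    NAMES = {v: k for k, v in KEYWORDS.items()}

    def tokenize(line):
        """What happens when you hit ENTER: crunch keywords to bytes."""
        out = []
        for word in line.split():
            if word in KEYWORDS:
                out.append(KEYWORDS[word])            # one byte, not 5 chars
            else:
                out.extend(word.encode("ascii"))      # literals kept as text
                out.append(ord(" "))
        return bytes(out)

    def detokenize(stored):
        """What LIST does: expand tokens back into program text."""
        parts = []
        for b in stored:
            parts.append(NAMES[b] + " " if b in NAMES else chr(b))
        return "".join(parts).rstrip()

    stored = tokenize('PRINT "HELLO"')
    print(stored)              # the keyword is stored as a single 0x80 byte
    print(detokenize(stored))  # PRINT "HELLO"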

Will Hartung