views:

442

answers:

4
+3  Q: 

Bytecode Design?

I'm designing a programming language which compiles to an intermediary bytecode. However, I'm having a lot of trouble designing the bytecode structure. Does anybody have any pointers on how to represent a program in binary? Alternatively, are there any resources (preferably free) on how to do this? The closest I've found is the description of the Lua interpreter's bytecode.

EDIT: A bit more information: I'm implementing my own garbage collection scheme which is heavily optimized for immutability and concurrency. For efficiency's sake I need some unique bytecode instructions that allow programs to interact with the garbage collection scheme.

A: 

You may find it useful to look at the Wikipedia article on bytecode (http://en.wikipedia.org/wiki/Bytecode) and follow some of the references to languages of the age and style you are interested in.

mas
+1  A: 

This article describes the GNU Smalltalk VM and its bytecode. Googling for "smalltalk bytecode" will come up with other resources.

anon
+1  A: 

Don't design your bytecode; it is unnecessary!

I would recommend looking into LLVM and GNU Lightning, which do a lot of the hard work for you and only require you to create an AST-like schema for translation after you have annotated things and resolved scope and so on.

The Dragon Book also includes some sections on bytecode. The Art of Computer Programming might also help, as the MIX language illustrates some dated but important design decisions.

Really, your intermediate code should be:

  1. Something that you wrote to be an efficient intermediate form, one that allows for popular optimization algorithms and translation to a backend without losing semantics through bad translation and the like; or
  2. A well-known, widely used IR that other tools can translate to machine code. Even the .NET/Mono setup can serve as an IR; if it suits your needs, then great.

It is all about your requirements. Don't design your own IR/bytecode unless you need to. If something else fits, use it! You don't need to maintain it!

Aiden Bell
Thank you for your input, but the most innovative features of my language are in the builtin data structures and the way they are garbage collected, which unfortunately must be implemented at the VM level.
Imagist
@Imagist ... then maybe a way forward would be to build a bytecode engine that allows external hooking for these things: a bytecode engine with event-driven internals and delegation of functionality. That would be quite awesome; it would let people specify in the bytecode which GC they would like to use (among other things).
Aiden Bell
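As a rough illustration of the hooking idea in the comment above, here is a minimal sketch of a toy stack machine whose bytecode includes an instruction that delegates to an external callback. Everything in it is an assumption made for illustration: the opcode names (OP_PUSH, OP_ADD, OP_GC_HOOK, OP_HALT), the fixed 5-byte instruction encoding, and the hook table are hypothetical and not taken from any real VM or from the poster's design.

```python
import struct

# Hypothetical opcodes for a toy stack machine (not from any real VM).
OP_PUSH, OP_ADD, OP_GC_HOOK, OP_HALT = 0, 1, 2, 3

INSTR_SIZE = 5  # 1-byte opcode + 4-byte little-endian signed operand


def assemble(program):
    """Encode (opcode, operand) pairs as fixed-width 5-byte instructions."""
    return b"".join(struct.pack("<Bi", op, arg) for op, arg in program)


def run(code, hooks):
    """Execute encoded bytecode; OP_GC_HOOK delegates to an external callback."""
    stack = []
    for offset in range(0, len(code), INSTR_SIZE):
        op, arg = struct.unpack_from("<Bi", code, offset)
        if op == OP_PUSH:
            stack.append(arg)
        elif op == OP_ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == OP_GC_HOOK:
            # The operand selects which externally supplied hook to call.
            hooks[arg](stack)
        elif op == OP_HALT:
            break
    return stack


# A stand-in "collector" hook that only records when it was invoked.
events = []
hooks = {0: lambda stack: events.append(("gc_safepoint", len(stack)))}

code = assemble([
    (OP_PUSH, 2),
    (OP_PUSH, 3),
    (OP_GC_HOOK, 0),
    (OP_ADD, 0),
    (OP_HALT, 0),
])
result = run(code, hooks)
```

The point of the sketch is only that the engine knows nothing about the collector: the bytecode names a hook by index, and whoever embeds the engine decides what that hook does.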
A: 

You can go over a list of Python bytecode instructions, and use the dis module to see what bytecodes are generated for simple programs.
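For instance, a small sketch of using dis on a trivial function (the function `add` is just an illustrative name; `dis.get_instructions` assumes Python 3.4 or later, and the exact opcodes printed vary between Python versions):

```python
import dis


def add(a, b):
    return a + b


# Print a human-readable listing of the function's bytecode.
dis.dis(add)

# The same instructions are also available programmatically.
opnames = [ins.opname for ins in dis.get_instructions(add)]
```

Comparing the listings for a few small functions is a quick way to see how a real stack-based bytecode handles locals, operators, and returns.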

See how-many-places-are-optimized-in-pythons-bytecodeversion-2-5 for discussion of bytecode optimization.

gimel