views:

1005

answers:

6

I'm interested in implementing a Forth system, just so I can get some experience building a simple VM and runtime.

When starting in Forth, one typically learns about the stack and its operators (DROP, DUP, SWAP, etc.) first, so it's natural to think of these as being among the primitive operators. But they're not. Each of them can be broken down into operators that directly manipulate memory and the stack pointers. Later one learns about store (!) and fetch (@) which can be used to implement DUP, SWAP, and so forth (ha!).

So what are the primitive operators? Which ones must be implemented directly in the runtime environment from which all others can be built? I'm not interested in high-performance; I want something that I (and others) can learn from. Operator optimization can come later.

(Yes, I'm aware that I can start with a Turing machine and go from there. That's a bit extreme.)

Edit: What I'm aiming for is akin to bootstrapping an operating system or a new compiler. What do I need do implement, at minimum, so that I can construct the rest of the system out of those primitive building blocks? I won't implement this on bare hardware; as an educational exercise, I'd write my own minimal VM.

+1  A: 

Which Forth implementation are you using that doesn't provide this information in the documentation? Given the nature of Forth, it might be implementation-dependent. There's a standard set of words in the dictionary, but whether they got there by assembly/C/whatever or by Forth shouldn't matter, since Forth is by definition a self-extensible language.

le dorfier
I've looked at pforth, which predefines hundreds of operators in its C source.
Barry Brown
By definition, the instruction set of your processor must be sufficient, so just expose that; because that's what all compilers build from. And even that instruction set is probably not entirely be necessary, because some instructions can be expressed in terms of the others.
le dorfier
+4  A: 

A long time ago, I had a book called "Threaded Interpretive Languages", published I think by Byte, that discussed how to implement a Forth-like language (I don't think they ever called it Forth) in Z80 assembly.

You may not have a Z80 handy, or want one, but the book might be instructive.

David Thornley
John R. Strohm
+1 for that book which I love, it was the book which hooked me on the idea of inventing and studying languages
Andy Dent
+6  A: 

This thread covers your exact question. Here is a soup-to-nuts implementation with complete documentation.

I wrote a subroutine threaded Forth targeting 68K when I was in college. I defined the runtime environment and dictionary format, then wrote some C code that boot strapped a Macintosh application that loaded a default dictionary, populated some I/O vectors and got the code running. Then I took the Leo Brodie book Starting Forth and started implementing the basic dictionary in 68K assembly language. I started with arithmetic/logic words, then did control structures then word definition/manipulation words. My understanding is that at a minimum you need @, !, +, -, * and /. The rest can be implemented in terms of those, but that's like trying to write an entire graphics library based on SetPixel and GetPixel: it will work, but yikes, why?

I enjoyed the process as there were some really puzzles, like getting DOES> exactly right (and once I had a solid DOES> implementation, I was creating closures that turned into tiny, tiny amounts of code).

plinth
+1  A: 

I'm still not convinced the question is well-formed. For example, Plinth's instructions can be reduced; after all, * and / can be implemented in terms of + and -, but then '+' can be implemented in terms of a successor function (see the Peano axioms.) Which puts you into the neighborhood of a Turing machine. How do you know where to stop?

Charlie Martin
I'd stop where it's educationally-instructive to do so. For example, I can certainly see implementing multiplication in terms of addition, but implementing addition as a primitive. Then we have incentive to weigh the pros and cons of implementing '*' and others as primitives.
Barry Brown
That's a different question, at least to a math geek. "Primitive" means "minimal, least, atomic" in this context.
Charlie Martin
+5  A: 

This post at comp.lang.forth lists a few "minimal Forths".

http://groups.google.com/group/comp.lang.forth/msg/10872cb68edcb526

Why do I know this? My brother, Mikael, wrote #3 and he also wrote a paper about making a "minimal Forth" (in Swedish, though). If I remember correctly he wanted to get a minimal set of operators that could be built in silicon.

epatel
+3  A: 

You might also want to take a look at Hans Bezemer's 4tH compiler.

boost