tags:

views:

182

answers:

2

In the spirit of this question, I'd like to ask a similar question, but about compilers, not about interpreters.

What is the conceptually smallest compiler that can compile its own code?

When I say "conceptually smallest" I mean that it uses only very basic concepts and builds up from there, rather than that it contains very short code. An example of why this is an important distinction is OTCC a very tiny C compiler which is small because it's obfuscated, not necessarily because it is conceptually simple (it may also be conceptually simple, but I don't know; it's obfuscated).

I would also like to add that the following also might be a very conceptually small program, but it doesn't actually tell us anything about what's going on, so it's not really what I'm looking for either:

(writefile argv[2] (generate (parse (readfile argv[1]))))

What I'm really looking for is a language that is:

  1. Turing complete.
  2. Capable of compiling itself.

I'm interested in this because

  1. it would be an interesting case study and
  2. it could be useful as a starting point for bootstrapping compilers.

If it doesn't exist, I may just write it myself. :)

+1  A: 

I'm not really clear what you mean by 'conceptually smallest'. Presumably you're not interested in minimal Turing machines or representations in Lambda calculus? If you're talking about physical compiler implementations, then you're really talking about a compiler that generates machine code instructions. TCC, as mentioned by Anthony Mills' comment, is relevant. Another interesting discussion that should have practical application is this detailed description of a bootstrapping compiler written from scratch.

There was an interesting discussion a while back on the comp.compilers newsgroup that's worth checking out.

ire_and_curses
+1  A: 

You don't say what the target machine is or whether the compiler has to exist or just be imagined.

In the world of the imagination, I would say that an adaptation of John McCarthy's metacircular LISP interpreter would come pretty close. You might also want to look at John Reynold's paper Definitional Interpreters for Higher-Order Languages which although dense is a model of simplicity.

In the world of reality, I'd bet on Chez Scheme, but unfortunately the native-code compiler is proprietary and closed source. Still, you can learn from studying the interpreter. Another system worth studying is the Oberon compiler, which was designed to be built and understood by one person, and it is very clean.

Norman Ramsey