I understand how a language can bootstrap itself, but I haven't been able to find much reference on why you should consider bootstrapping.

The intuitive answer is that the language you're writing offers utilities that are not found in the "base" language of the compiler, and the language's features are relatively well-suited for a compiler.

For instance, it would make sense to bootstrap a C++ compiler -- it could potentially be much easier to maintain the compiler when OOP is properly used, as opposed to using plain C.

On the other hand, MATLAB certainly makes matrix math a lot easier than plain C, but I can't see any apparent benefits from writing a MATLAB compiler/interpreter in MATLAB -- it seems like it would become less maintainable. A similar view could be applied to the R programming language. A pretty extreme example would be bootstrapping Whitespace, whose implementation is written in Haskell -- definitely a massive superset of Whitespace.

Is the only reason for bootstrapping to take advantage of the new language's features? I know there's also the "because we can" reason, but that's not what I'm looking for :)

+14  A: 

There's a principle called "eating your own dogfood". By using a tool, you demonstrate the usefulness of the tool.

It is often asked, "if the compiler for language X isn't written in language X, why should I risk using it?"

This of course only applies to languages suitable for the domain of compiler writing.

Daniel Earwicker
+4  A: 

It can be considered the bar that separates "toy" languages from "real" languages. If the language isn't rich enough to implement itself, it's still a toy. But this is probably an attitude from a largely bygone era, given the number of popular languages today that are implemented in C.

Peter Seibel
I disagree with the categorization of "toy" vs "real". Languages serve a purpose. Some are general purpose languages like C, VB, etc which can be used to invent themselves. Others, like MATLAB, are specialized for the purpose of solving specific classes of problems. This doesn't make them "toy" languages, just specialized.
Chris Lively
+4  A: 

One advantage would be that developers working on the compiler would only need to know the language being compiled. Otherwise developers would need to know the language being compiled as well as the language the compiler is written in.

sepp2k
I don't think that's a very big deal, though. Typically, a compiler writer (or really, any computer scientist) should know many, many languages.
BobbyShaftoe
You might be surprised. I speak several languages, but when I had to try maintaining a system that was documented in the two I know best (English and German), I found it to be at least 10 times as difficult as maintaining one entirely in English. Little impedance mismatches really add up.
Ken
+6  A: 

There are two main advantages to bootstrapped language implementations. The first, as you suggest, is to take advantage of the high-level features of said language in the implementation. A less obvious but no less important advantage is that it lets you customize and extend the language without dropping into a lower layer written in C (or Java, or whatever sits below the new language runtime).

Metaprogramming may not be useful for most day-to-day tasks, but there are times where it can save you a lot of duplicated or boilerplate code. Being able to hook into the compiler and core runtime for a language at a high level can make advanced metaprogramming tasks much easier.
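As a rough illustration of what "hooking into the compiler at a high level" can look like, here is a minimal Python sketch. It is an analogy only: CPython's compiler is written in C, not bootstrapped, but its standard `ast` module exposes the syntax tree to ordinary Python code. The `InlineConstants` class and `compile_with_constants` helper are hypothetical names made up for this example.

```python
import ast


class InlineConstants(ast.NodeTransformer):
    """Replace reads of selected names with literal values before compilation."""

    def __init__(self, constants):
        self.constants = constants

    def visit_Name(self, node):
        # Only rewrite loads (reads); assignments to the name are left alone.
        if isinstance(node.ctx, ast.Load) and node.id in self.constants:
            literal = ast.Constant(value=self.constants[node.id])
            return ast.copy_location(literal, node)
        return node


def compile_with_constants(source, constants):
    """Parse, rewrite the tree, then hand it back to the built-in compiler."""
    tree = ast.parse(source)
    tree = InlineConstants(constants).visit(tree)
    ast.fix_missing_locations(tree)
    return compile(tree, "<inlined>", "exec")


namespace = {}
code = compile_with_constants("def area(r):\n    return PI * r * r\n", {"PI": 3})
exec(code, namespace)
print(namespace["area"](2))  # PI is never defined at runtime; it was inlined as 3
```

The whole transformation is written in the language being compiled, which is the point: no C extension or separate toolchain is needed to change how the code is compiled.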

rcoder
+1  A: 

You don't bootstrap a compiler for a DSL. You don't write an SQL query compiler in SQL. MATLAB might look like a general-purpose language, but actually it isn't -- it is a language designed for numerical calculations.

liori
+2  A: 

Low-level languages are often bootstrapped because in order to put code on the new system, you need a low-level compiler there. Get a C compiler over and now you have tons of code available to use. Having a bootstrapped compiler makes this easier: you only require the presence of your own code in order to compile and improve your own code.

There are other ways to accomplish this, like making a cross-compiler. On most systems you never need to be able to compile static languages on the device itself in ordinary use (in fact, systems like Windows ship with no compiler).

Another reason compilers often bootstrap is so that they don't have to worry about bugs in the compiler they are compiled with. Ensure your compiler can be compiled with itself and you limit the combinations of bugs that might otherwise appear if you compile with another compiler.
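To make the self-compilation check concrete, here is a toy Python model of a three-stage bootstrap. It is a sketch under simplifying assumptions, not a real build: `Binary`, `compile_with` and `bootstrap` are hypothetical names, and a "binary" is reduced to two labels (what logic it implements, and which compiler logic emitted its code). The idea is that the compiler built by itself (stage 2) and the compiler built by that result (stage 3) should be identical; GCC's bootstrap build performs a comparison along these lines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binary:
    behaves_as: str  # which compiler logic this executable implements
    emitted_by: str  # which compiler logic generated its machine code


def compile_with(compiler: Binary, source_logic: str, miscompiles: bool = False) -> Binary:
    """Toy stand-in for running `compiler` on the new compiler's source."""
    # A miscompiling compiler yields an executable that misbehaves.
    behaves_as = (source_logic + "-broken") if miscompiles else source_logic
    return Binary(behaves_as=behaves_as, emitted_by=compiler.behaves_as)


def bootstrap(host: Binary, source_logic: str, host_miscompiles: bool = False):
    """Build the new compiler three times, each stage using the previous one."""
    stage1 = compile_with(host, source_logic, miscompiles=host_miscompiles)
    stage2 = compile_with(stage1, source_logic)
    stage3 = compile_with(stage2, source_logic)
    return stage1, stage2, stage3


host = Binary(behaves_as="hostcc", emitted_by="vendor")
clean = bootstrap(host, "newcc")
buggy = bootstrap(host, "newcc", host_miscompiles=True)

print(clean[1] == clean[2])  # stages 2 and 3 agree when nothing went wrong
print(buggy[1] == buggy[2])  # a mismatch points at a bug introduced by the host
```

In this model stage 1 always differs from stage 2, since the host emitted its code, so the useful comparison is stage 2 against stage 3.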

I think bootstrapping high-level languages is mostly done to show off one's hairy-chested programming skills.

Southern Hospitality
+1  A: 

Bootstrapping also has another advantage: if your language is nice, you might save time by writing your compiler in <insert language here> rather than in, let's say, C. For instance, the C# compiler was written in C++, but now they are rewriting it in C#, which allows them (among other things) to use the threading framework from the CLR instead of rolling their own in C++. It also follows the lead of the Mono guys: marketing-wise, Mono was in a better position by being able to say "our C# compiler is actually written in C#".

Anteru
A: 

There are a couple of reasons you might want to do it (in theory):

  1. Your compiler generates more optimized code than other compilers on the bootstrap platform.
  2. Your compiler generates more correct code than the other compilers on the bootstrap platform.
  3. You're an egotistical jerk who is convinced that one of the above is true even though it's not.
  4. There isn't a compiler available on your platform (this was GCC's original logic, because many platforms didn't have a C compiler back in the day).
  5. You want to prove that your compiler can handle it (this is, after all, actually a pretty good test of a compiler).
Russell Newquist
A: 

Compilers solve a wide variety of non-trivial problems including string manipulation, handling large data structures, and interfacing with the operating system. If your language is intended to handle those things, then writing your compiler in your language demonstrates those capabilities. Additionally, it creates an exponential effect because as your language includes more features, you have more features you can use in your compiler. If you implement any unique features that would make compiler-writing easier, you have those new tools available to implement even more features.

However, if your language is not intended to handle the same problems as compilation, then bootstrapping will only tempt you to clutter your language with features which are related to compilation but not to your target problem. Self-compilation with MATLAB or SQL would be ridiculous; MATLAB has no reason to include strong string manipulation functions and SQL has no reason to support code generation. The resulting language would be unnecessarily cluttered.

It's also worth noting that interpreted languages are a slightly different problem and should be treated accordingly.

Imagist