Are compilers written in different languages?
views: 207
answers: 5
Compilers are often written in the language of said compiler. For example, a C compiler is typically written in C.
... which brings up the question "how do you compile it the first time?". To which I've heard two answers: either it is hand compiled (as scary as it sounds), or one cheats and uses an existing compiler for that language.
As @jball has commented, read the Wikipedia article on Bootstrapping for complete details.
What is the language of anything?
GCC, for instance, was written in C (since version 4.8 it is built as C++).
There was once a story about a Lisp interpreter written in Lisp.
This brings up the following question: if the C compiler is written in C, then what compiled the first C compiler? For this, read here.
A compiler could probably be written in any language. In its most basic form, a compiler merely converts code from one language to another. In the sense that most people use the term "compiler" today, they are referring to something that takes in source code of some higher-level language and converts it to either assembly or some low-level intermediate language (such as CIL).
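To make the "converts code from one language to another" point concrete, here is a minimal sketch in Python. It translates infix arithmetic into instructions for a made-up stack machine. The instruction names (`PUSH`, `ADD`, ...) and the function `compile_expr` are my own assumptions for illustration, and it leans on Python's standard `ast` module for parsing rather than doing the parsing by hand.

```python
import ast

# Map Python AST operator types to instructions of a hypothetical stack machine.
_OPCODES = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL", ast.Div: "DIV"}


def compile_expr(source: str) -> list[str]:
    """Translate an arithmetic expression into stack-machine instructions."""
    tree = ast.parse(source, mode="eval")  # the "front end": text -> syntax tree
    code: list[str] = []

    def emit(node: ast.AST) -> None:
        # The "back end": walk the tree and emit target-language instructions.
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            code.append(f"PUSH {node.value}")
        elif isinstance(node, ast.BinOp) and type(node.op) in _OPCODES:
            emit(node.left)    # operands first (post-order traversal) ...
            emit(node.right)
            code.append(_OPCODES[type(node.op)])  # ... then the operator
        else:
            raise SyntaxError(f"unsupported construct: {type(node).__name__}")

    emit(tree.body)
    return code


print("\n".join(compile_expr("1 + 2 * 3")))
```

Nothing here depends on Python specifically; the same two steps (parse, then emit) could be written in pretty much any general-purpose language.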
Sometimes yes, sometimes no. It's customary to try and implement the compiler for a new language in that language itself as soon as possible, partially to prove that it can do "heavy lifting".
But of course, you first need a compiler or at least interpreter to run that compiler and have it compile itself - so you first have to implement it in a different language.
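The bootstrap sequence can be modelled roughly as follows. This is only a sketch: it treats a compiler as a triple (source language, target language, language it is written in), and the names `Compiler`, `compile_with`, "NewLang" and "x86" are made up for the example. No real compilation happens; the code just checks which steps are possible and what they produce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compiler:
    source: str      # language the compiler accepts
    target: str      # language the compiler emits
    written_in: str  # language the compiler itself is expressed in


def compile_with(tool: Compiler, program: Compiler, machine: str) -> Compiler:
    """Run `tool` on `machine` to translate the compiler `program`."""
    if tool.written_in != machine:
        raise ValueError("the tool is not executable on this machine")
    if program.written_in != tool.source:
        raise ValueError("the tool cannot read the program's language")
    # Same compiler, but now expressed in the tool's target language.
    return Compiler(program.source, program.target, tool.target)


# An existing, runnable C compiler (hypothetical).
cc = Compiler(source="C", target="x86", written_in="x86")

# Step 1: the first NewLang compiler is written in a different language (C).
newlang_in_c = Compiler(source="NewLang", target="x86", written_in="C")
stage0 = compile_with(cc, newlang_in_c, machine="x86")

# Step 2: the compiler is rewritten in NewLang and built with stage0.
newlang_in_newlang = Compiler(source="NewLang", target="x86", written_in="NewLang")
stage1 = compile_with(stage0, newlang_in_newlang, machine="x86")

# Step 3: from now on the compiler can rebuild itself; the C version is not needed.
stage2 = compile_with(stage1, newlang_in_newlang, machine="x86")
print(stage2 == stage1)
```

Trying `compile_with(cc, newlang_in_newlang, "x86")` raises an error, which is the bootstrap problem in a nutshell: nothing runnable understands NewLang until step 1 has been done.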
And for many specialized languages, writing the compiler in the language itself is not practical because the language is not meant for things like compilers.
Here's a couple of examples:
- the Rubinius Ruby compiler is written in Ruby,
- the YARV Ruby compiler is written in C,
- the XRuby Ruby compiler is written in Java,
- the Ruby.NET Ruby compiler is written in C#,
- the MacRuby Ruby compiler is written in Objective-C,
- the IronJS ECMAScript compiler is written in F#,
- the MS Visual F# compiler is written in F#,
- the MS Visual C# compiler was written in C++ and has since been rewritten in C# (Roslyn),
- the MS Visual Basic.NET compiler was written in C++ and has since been rewritten in Visual Basic.NET (Roslyn),
- the GCC C compiler was written in C (built as C++ since GCC 4.8),
- the Clang C compiler is written in C++,
- most Pascal compilers are written in Pascal,
- most Oberon compilers are written in Oberon,
- the original 6g/8g Go compilers were written in C (the gc compiler has been written in Go since Go 1.5), and the gccgo frontend is written in C++.
In general, compilers can be written in any language that is actually powerful enough to write a compiler in. This obviously includes any Turing-complete language. But it might even be possible to write a compiler in a non-Turing-complete language. (For example, I don't see any obvious reason why a compiler couldn't be a total function, but total functions are obviously not Turing-complete.)
In practice, however, compilers are mostly written in three specific classes of languages with different pros and cons:
- the same language that the compiler implements (pros: larger community, because everybody who knows the language can work on the compiler, otherwise they would need to know both languages; cons: the bootstrap problem)
- the primary low-level systems programming language of the platform the compiler is supposed to run on, e.g. C on Unix, Java on the JVM, C# on the CLI (pros: very fast; cons: oftentimes those languages are simply not very good for writing compilers, also I don't actually believe that the performance benefits are real)
- a language that is very good for writing compilers like ML, Haskell, Lisp, Scheme (pros: those compilers tend to be very easy to understand and hack on; cons: you still need to know both languages)
- special case of the above: a domain-specific language for writing compilers, like OMeta or for the parsing frontend ANTLR, YACC (pros: same as above but even more so; cons: same as above)
All of these are essentially tradeoffs: writing the compiler in the same language makes it easier to understand, because you don't have to learn another language. It can also make it harder to understand because the language isn't actually very good at writing compilers. (Imagine, for example, writing a SQL compiler in SQL.) It might even be impossible to write a compiler, for example (for a pretty loose definition of "language" and "compiler") it is impossible to write a CSS compiler in CSS or an HTML compiler in HTML.
On the opposite side: writing the compiler in a specialized compiler-writing language probably makes it easier to understand, but at the same time it requires you to learn a new language.
Note that the three classes are not disjoint: a compiler can fall into more than one class. For example, a compiler for a specialized compiler-writing language, written in itself, falls into both category 1 (written in itself) and category 3 (written in a language good at writing compilers).
In some cases, you are actually able to hit the sweet spot. For example, F# is a native language with native speed on the CLI, and it is very good at writing compilers. So, writing the F# compiler in F# gives you #1 (writing in itself), #2 (writing in a native, fast language) and #3 (writing in a language that is good for writing compilers). The same applies to Scala.