Generally, the first version of the compiler is written in a different language, and each subsequent version is written in the language itself and compiled with the previous version. Once you've compiled version x with version x-1, you can use the newly built version x to recompile itself, taking advantage of any new optimizations that version introduces; GCC does its releases that way.
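To make the "recompile itself" step concrete, here is a toy model in Python. It is only a sketch: `Source`, `Binary`, and `compile_with` are names made up for illustration, and a "binary" is reduced to three numbers. The assumption is that a compiled compiler behaves according to the source it was built from, while the quality of its own machine code depends on the optimizer of whatever compiled it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    version: int    # which compiler version this source implements
    optimizer: int  # optimization level this source knows how to apply


@dataclass(frozen=True)
class Binary:
    version: int     # behaviour: the source version it was built from
    optimizer: int   # optimizations it applies to the code it compiles
    built_with: int  # optimization level applied to this binary itself


def compile_with(compiler: Binary, src: Source) -> Binary:
    """Toy model: behaviour comes from the source, code quality from the compiler."""
    return Binary(version=src.version,
                  optimizer=src.optimizer,
                  built_with=compiler.optimizer)


# Version x-1 is already installed; version x adds a better optimizer.
old_compiler = Binary(version=1, optimizer=1, built_with=0)
new_source = Source(version=2, optimizer=2)

stage1 = compile_with(old_compiler, new_source)  # x built by x-1
stage2 = compile_with(stage1, new_source)        # x rebuilt by itself
stage3 = compile_with(stage2, new_source)        # rebuilt once more as a check

print(stage1)
print(stage2)
print(stage2 == stage3)
```

In this model `stage1` already behaves like version x but was itself optimized by the old compiler, while `stage2` gets the benefit of the new optimizer. Building a third time and checking that nothing changes is a cheap sanity test that the compiler reproduces itself.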
It is. You usually need a bootstrap version of the language, either compiled or interpreted, implemented in another language.
And to bend your mind a little more, years ago I read the history of a Pascal compiler written as a grad student project. It was written in Pascal and compiled with the system's built-in Pascal compiler. Eventually, it was good enough to replace the system's built-in Pascal compiler. Unfortunately, they found a bug in code generation, and the fix for the code generator triggered that very bug in the compiler, generating a bad compiler. Fixing it required hand-patching the binaries of the installed compiler, so that it could then compile the patched source and replace itself.
The first version of the compiler is normally written in something else until the language is well-formed enough to compile its own compiler; then you can get into "x is written in x".
It's only a problem for the very first version ever. Once I have V1.0 of the compiler working I can write V2.0 in my language and use the V1.0 compiler to compile it. Then I can write V3.0 and use V2.0 to compile that, use V3.0 to compile V4.0 and so on.
At the very beginning, the real first compiler for that language was, of course, not written in that language. The second one could be written in that language. Moreover, given a spec of a language, you can implement a basic core in a bootstrap compiler, and then write the fully compliant compiler in that language using only the subset understood by the "bootstrap" compiler. Second-generation compilers can forget the "bootstrap" compiler too.
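The subset idea can be sketched with plain sets in Python. Everything here is hypothetical: the feature names are invented, and `can_compile` just checks that a compiler supports every feature a program's source uses.

```python
# Hypothetical feature names; a real language spec would be far richer.
CORE_FEATURES = frozenset({"functions", "integers", "if", "while"})
FULL_FEATURES = CORE_FEATURES | {"closures", "generics", "exceptions"}


def can_compile(compiler_supports, source_uses):
    """A compiler can build a program only if it supports every feature used."""
    return source_uses <= compiler_supports


# First full compiler: implements FULL_FEATURES, but its own source
# sticks to the core subset so the bootstrap compiler can build it.
gen1_source_uses = frozenset({"functions", "integers", "if", "while"})

# Second generation: its source is free to use the whole language.
gen2_source_uses = frozenset({"functions", "closures", "generics"})

bootstrap_builds_gen1 = can_compile(CORE_FEATURES, gen1_source_uses)
gen1_builds_gen2 = can_compile(FULL_FEATURES, gen2_source_uses)
bootstrap_builds_gen2 = can_compile(CORE_FEATURES, gen2_source_uses)

print(bootstrap_builds_gen1, gen1_builds_gen2, bootstrap_builds_gen2)
```

The bootstrap compiler is only ever needed for the first full compiler; once that exists, later generations are built by it and no longer have to restrict themselves to the core.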
At some point, you need a compiler (or interpreter) written in a different language. But it doesn't need to be efficient and can be done in a language that makes parsing and prototyping easy (LISP is popular). Once you have used this to compile the "self-compiler", you can discard it and use the result.