I want to know how to design a compiler that compiles very, very quickly.
First, let me head off some obvious misunderstandings of my question:
I am not talking about the speed of the code produced by the compiler. There are already many resources available for learning how to optimize generated code. What I'm having trouble finding is information on making the compiler fast.
I'm also not interested in a discussion of why C++ compilers are generally slower than Java compilers (for example). I'm interested in what techniques can be used to speed up the compiler for any given language.
I also don't want to hear about distributed compilation systems like IncrediBuild or distcc. Those systems don't give you faster compilers; they just give you more compilers. That's certainly useful, but it's not the question I'm asking. I want to know how to design a fast compiler for a single CPU.
Nor is ccache the answer I'm looking for. That's a system that lets you skip invoking the compiler when a cached result can be reused, but it doesn't make the compiler itself faster. Again, that's useful; again, it's not the question I'm asking.
I hope my question is now crystal clear. But perhaps some history will make it even clearer.
C compilers used to be really slow. Then, in 1986, THINK Technologies introduced Lightspeed C for Macintosh, and it compiled programs almost instantaneously. Lightspeed C was so much faster than all the other C compilers that there was hardly any comparison. (Perhaps Lightspeed C wasn't the first of the new generation of lightning-fast compilers, but it was the first in my experience. Turbo Pascal came earlier [1983] but I had no experience with it, so I don't know how it compared, speed-wise.)
Since then, many fast compilers have been available. It seems that there was some kind of quantum leap in compiler technology in the 1980s, and that in particular is what I'm trying to understand. What was the breakthrough?
The answer may be this simple: with IDEs like Lightspeed and Turbo, the integrated editor already has the source code in RAM. If the compiler operates off that data, it eliminates disk I/O, which is typically the slowest part of compilation. That's probably a very important contributor to the speed improvement, as long as the source code is small relative to available memory. (In those days, RAM sizes were much smaller, but then so were typical program sizes.)
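To make concrete what I mean by "operates off that data", here's a minimal, purely illustrative C sketch of a front end that lexes straight out of the editor's in-memory buffer instead of opening a file. The names (`SourceBuffer`, `count_tokens`) are invented for this example, not taken from any real compiler.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Invented names for illustration only. */
typedef struct {
    const char *text;   /* the editor's in-memory copy of the source */
    size_t      length;
} SourceBuffer;

/* Stand-in for a real lexer: it walks the buffer directly, no file I/O. */
static size_t count_tokens(SourceBuffer src)
{
    size_t tokens = 0, i = 0;
    while (i < src.length) {
        if (isspace((unsigned char)src.text[i])) {
            i++;                                     /* skip whitespace */
        } else {
            tokens++;                                /* a token starts here */
            while (i < src.length && !isspace((unsigned char)src.text[i]))
                i++;                                 /* consume the token */
        }
    }
    return tokens;
}

int main(void)
{
    /* An IDE would hand its buffer to the compiler like this,
       instead of passing a filename to be read back off the disk. */
    const char *editor_buffer = "int main(void) { return 0; }";
    SourceBuffer src = { editor_buffer, strlen(editor_buffer) };
    printf("%zu whitespace-delimited tokens\n", count_tokens(src));
    return 0;
}
```

The point is only that the input is a (pointer, length) pair handed over by the IDE, so the tokenizing hot path never waits on the disk.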
Is that it? Or were there other important innovations involved? And have there been important improvements in compiler speed since then?