views:

451

answers:

7

I am designing a programming language, and one of the problems I have been thinking about is why programming languages take so long to compile. I assumed C++ takes a long time because it needs to parse and compile a header every time it compiles a file. But I have heard that precompiled headers take just as long? I suspect C++ is not the only language that has this problem.

+11  A: 

Compiling is a complicated process which involves quite a few steps:

  • Scanning/Lexing
  • Parsing
  • Intermediate code generation
  • Possibly Intermediate code optimization
  • Target Machine code generation
  • Optionally Machine-dependent code optimization

(Leaving aside linking.)

Naturally, this will take some time for longer programs.
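
To make the phases above concrete, here is a minimal sketch of a toy compiler for integer arithmetic. It covers only scanning, parsing and code generation for an invented stack machine; the optimization phases are left out, and every name in it (lex, parse, generate, run, compile_source) is made up for illustration rather than taken from any real compiler.

```python
import operator
import re

OPS = {"+": "add", "-": "sub", "*": "mul", "/": "div"}
IMPL = {"add": operator.add, "sub": operator.sub,
        "mul": operator.mul, "div": operator.floordiv}

def lex(source):
    """Scanning: turn characters into (kind, value) tokens."""
    tokens = []
    for text in re.findall(r"\d+|\S", source):
        if text[0] in "0123456789":
            tokens.append(("num", int(text)))
        elif text in "+-*/()":
            tokens.append(("op", text))
        else:
            raise SyntaxError(f"unexpected character {text!r}")
    return tokens

def parse(tokens):
    """Parsing: recursive descent producing a nested-tuple AST."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", None)

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def primary():
        kind, value = advance()
        if kind == "num":
            return ("num", value)
        if (kind, value) == ("op", "("):
            node = expr()
            if advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    def binary(operand, operators):
        # Left-associative chain of operators at one precedence level.
        node = operand()
        while peek()[0] == "op" and peek()[1] in operators:
            op = advance()[1]
            node = (op, node, operand())
        return node

    def term():
        return binary(primary, "*/")

    def expr():
        return binary(term, "+-")

    tree = expr()
    if peek()[0] != "eof":
        raise SyntaxError("trailing input")
    return tree

def generate(node):
    """Code generation: post-order walk emitting stack-machine instructions."""
    if node[0] == "num":
        return [("push", node[1])]
    op, left, right = node
    return generate(left) + generate(right) + [(OPS[op],)]

def run(code):
    """A tiny interpreter for the generated code, just to check the result."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(IMPL[instr[0]](left, right))
    return stack.pop()

def compile_source(source):
    return generate(parse(lex(source)))
```

For instance, run(compile_source("2 + 3 * (4 - 1)")) evaluates the generated code. Even in this toy, every character passes through three separate stages, and a real compiler does far more work in each of them.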

Rob Lachlan
+2  A: 

They take as long as they take and it usually depends on how much extraneous stuff you inject into your compilation units. I'd like to see you hand-compile them any faster :-)

The first time you compile a file, you should have no headers at all. Then add them as you need them (and check when you're finished whether you still need them).

Another way of reducing that time is to keep your compilation units small (even to the point of one function per file, in an extreme case) and to use a make-like tool to ensure you only build what's needed.

Some compilers (IDEs really) do incremental compilation in the background so that they're (almost) always close to fully-compiled.
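
As a rough illustration of what a make-like tool decides, the sketch below marks a target as stale when it has never been built, when any of its inputs is newer than it, or when any of its inputs is itself stale. The function name stale_targets and the dictionary layout are assumptions made for this example, not how make is actually implemented. It also assumes every source file has a timestamp and that the dependency graph has no cycles.

```python
def stale_targets(deps, mtimes):
    """Return the set of targets that need rebuilding.

    deps maps each target to the list of files it is built from.
    mtimes maps file names to modification times; a target that is
    missing from mtimes is treated as never built.
    """
    memo = {}

    def is_stale(name):
        if name not in deps:
            return False  # a plain source file is never "built"
        if name not in memo:
            built_at = mtimes.get(name)  # None if never built
            stale = built_at is None
            for dep in deps[name]:
                # Stale if an input must be rebuilt or is newer than us.
                if is_stale(dep) or built_at is None or mtimes[dep] > built_at:
                    stale = True
            memo[name] = stale
        return memo[name]

    return {target for target in deps if is_stale(target)}


deps = {
    "app": ["a.o", "b.o"],
    "a.o": ["a.c", "common.h"],
    "b.o": ["b.c", "common.h"],
}
mtimes = {"a.c": 5, "b.c": 1, "common.h": 1, "a.o": 2, "b.o": 2, "app": 3}
print(sorted(stale_targets(deps, mtimes)))
```

With these made-up timestamps only a.c has changed, so b.o is left alone. Touching common.h instead would make every object file stale, which is why a header that is included everywhere hurts so much.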

paxdiablo
++ for smaller compilation units
Thilo
+6  A: 

After you finish writing the compiler for that language you're designing, you'll know exactly why.

Paul Beckingham
A smartass answer. Meh, I didn't rank you up or down.
acidzombie24
Yes, it was a smartass answer, but if someone is unaware of the extent of the complexities of a compiler, there's nothing like taking on such a project to vividly show the enormity of the task.
Paul Beckingham
+2  A: 

Language design does have an effect on compiler performance. C++ compilers are typically slower than C# compilers, which has a lot to do with the design of the language. (This also depends on the compiler implementer; Anders Hejlsberg implemented C# and is one of the best around.)

The simplistic "header file" structure of C++ contributes to its slower performance, although precompiled headers can often help. C++ is a much more complex language than C, and C compilers are therefore typically faster.

Greg Hewgill
Are you sure Anders was involved in the implementation of the compiler? I thought he was more involved from a high-level standpoint, and not down in the compiler code, but this is just speculation on my part.
Travis
I would be surprised if he wasn't involved in at least part of the actual implementation, but I don't know for sure. He's definitely involved in the language design though.
Greg Hewgill
Delphi and Python come to mind as being exceptionally fast compilers...
Arafangion
Anders was also the chief engineer behind Delphi (and Turbo Pascal before that).
Greg Hewgill
+3  A: 

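One C++-specific problem that makes it horribly slow is that, unlike almost any other language, you can't parse it independently of semantic analysis. For example, the statement a * b; declares b as a pointer if a names a type, and is a multiplication otherwise, so the parser has to know what a is before it can build the tree.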

dsimcha
+1  A: 

Precompiled headers are way faster, as has been known at least since 1988.

The usual reason for a C compiler or C++ compiler to take a long time is that it has to #include, preprocess, and then lex gazillions of tokens.

As an exercise you might find out how long it takes just to run cpp over a typical collection of header files---then measure how long it takes to lex the output.
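
One possible way to run that exercise is sketched below. It times an arbitrary preprocessor command and then times a deliberately crude regex tokenizer over its output. The function name preprocess_stats and the token pattern are assumptions made for this example. The pattern is only a rough approximation of C lexing, and a Python regex is much slower than a real compiler's lexer, so treat the numbers as relative rather than absolute.

```python
import re
import subprocess
import time

# Crude approximation of C tokens: identifiers, integers, or any other
# single non-space character. Real lexers are considerably more careful.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def preprocess_stats(argv):
    """Run a preprocessor command and lex what it prints.

    Returns (preprocess_seconds, lex_seconds, token_count).
    """
    start = time.perf_counter()
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    preprocess_seconds = time.perf_counter() - start

    start = time.perf_counter()
    token_count = len(TOKEN.findall(result.stdout))
    lex_seconds = time.perf_counter() - start

    return preprocess_seconds, lex_seconds, token_count
```

Assuming cpp is on your PATH and all_headers.h is a hypothetical file that includes your usual headers, preprocess_stats(["cpp", "all_headers.h"]) gives a feel for how many tokens the compiler has to chew through before it reaches any of your own code.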

gcc -O uses a very effective but somewhat slow optimization technique developed by Chris Fraser and Jack Davidson. Most other optimizers can be slow because they involve repeated iteration over fairly large data structures.

Norman Ramsey
Would it be fair to say that trying to stick a lot of intelligence into the optimization is a sure way to create long compile times? This has been my casual understanding, but I'm loath to make a blanket statement...
dmckee
A: 

Compilation does not need to take long: tcc compiles ANSI C fast enough to be useful as an interpreter.

Some things to think about:

  1. Complexity in the scanning and parsing passes. Presumably requiring long look-aheads will hurt, as will contextual (as opposed to context-free) languages.
  2. Internal representation. Building and working on a large and featureful AST will take some time. Presumably you should use the simplest internal representation that will support the features you want to implement.
  3. Optimization. Optimization is fussy. You need to check for a lot of different conditions. You probably want to make multiple passes. All of this is going to take time.
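
To illustrate the third point, here is a toy sketch of why optimizers end up making multiple passes. It has two invented passes over a nested-tuple expression tree: fold evaluates operators whose operands are both integers, and simplify applies a few algebraic identities. Each pass can expose work for the other, so the driver repeats them until nothing changes. The tree format and all function names are assumptions made for this example, and the identities are only valid for integer arithmetic.

```python
def fold(node):
    """Constant folding: evaluate operators whose operands are both ints."""
    if not isinstance(node, tuple):
        return node
    op, left, right = node[0], fold(node[1]), fold(node[2])
    if isinstance(left, int) and isinstance(right, int):
        return {"+": left + right, "-": left - right, "*": left * right}[op]
    return (op, left, right)

def simplify(node):
    """Algebraic identities: e*0 -> 0, e*1 -> e, e+0 -> e, e-0 -> e."""
    if not isinstance(node, tuple):
        return node
    op, left, right = node[0], simplify(node[1]), simplify(node[2])
    if op == "*":
        if left == 0 or right == 0:
            return 0
        if left == 1:
            return right
        if right == 1:
            return left
    if op == "+":
        if left == 0:
            return right
        if right == 0:
            return left
    if op == "-" and right == 0:
        return left
    return (op, left, right)

def optimize(tree, passes=(fold, simplify)):
    """Run all passes repeatedly until a full round changes nothing.

    Returns (optimized_tree, number_of_rounds_executed).
    """
    rounds = 0
    while True:
        rounds += 1
        new_tree = tree
        for run_pass in passes:
            new_tree = run_pass(new_tree)
        if new_tree == tree:
            return tree, rounds
        tree = new_tree

# ((x * 0) + 2) * 3
tree = ("*", ("+", ("*", "x", 0), 2), 3)
print(optimize(tree))
```

Here the first round cannot fold anything until simplify has removed x * 0, the second round folds 2 * 3, and a third round is spent only on confirming that nothing else changed. Every round walks the whole tree again, which is where the time goes.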
dmckee