views: 392
answers: 8

I find that when there are a lot of classes, compilation time increases dramatically if I use one *.h and one *.cpp file per class. I already use precompiled headers and incremental linking, but the compile time is still very long (yes, I use boost ;)

So I came up with the following trick:

  • marked *.cpp files as non-compilable (excluded from the build)
  • marked *.cxx files as compilable
  • added one *.cxx file per application module, and #included all of that module's *.cpp files in it (see the sketch below)
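
For illustration, a module's .cxx file would look something like this (file and module names are hypothetical, not from the actual project):

    // gui_module.cxx - one translation unit for the whole GUI module.
    // The *.cpp files below are excluded from the build; only this
    // file is handed to the compiler.
    #include "window.cpp"
    #include "button.cpp"
    #include "text_box.cpp"
    // ...every other *.cpp of the module goes here; forget one and
    // the linker reports unresolved symbols.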

So instead of 100+ translation units I ended up with only 8, and compile time dropped by a factor of 4-5.

The downsides are that you have to include all the *.cpp files manually (though it's not really a maintenance nightmare: if you forget to include something, the linker will remind you), and that some VS IDE conveniences stop working with this scheme, e.g. Go To/Move to Implementation.

So the question is: is having lots of .cpp translation units really the only true way? Is my trick a known pattern, or am I missing something? Thanks!

+4  A: 

I've seen this done in video games, since it helps the compiler perform optimizations it otherwise couldn't and saves a lot of memory. I've seen "uber build" and "bulk build" used to refer to this idea. And if it speeds up your build, why not...

Jim Buck
+4  A: 

One significant drawback of this approach comes from having one .obj file per translation unit.

If you create a static library for reuse in other projects, you will often end up with bigger binaries in those projects if the library consists of several huge translation units instead of many small ones, because the linker only includes the .obj files containing the functions/variables actually referenced by the project using the library.

With big translation units, it's more likely that each unit is referenced somewhere, so the corresponding .obj file gets included. Bigger binaries may be a problem in some cases. Some linkers, though, are smart enough to include only the necessary functions/variables rather than whole .obj files.

Also, if the .obj file is included, all its global variables come along with it, and their constructors/destructors will be called when the program starts/stops, which surely takes time.
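
A minimal sketch of that effect, with illustrative names: a library source file containing a global object drags its constructor/destructor into every program whose linker pulls in the enclosing .obj.

    // trace.cpp - part of a hypothetical static library; in a unity
    // build it is merged into one big .obj with many unrelated files.
    #include <cstdio>

    struct TraceSession {
        TraceSession()  { std::printf("trace started\n"); } // runs at startup
        ~TraceSession() { std::printf("trace stopped\n"); } // runs at exit
    };

    static TraceSession g_trace; // constructed whenever the enclosing .obj
                                 // is linked in, even if the client never
                                 // calls anything from trace.cpp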

sharptooth
Visual Studio supports function-level linking. Still relevant for Unix/Linux, though.
MSalters
Will function-level linking exclude only functions, or variables too?
sharptooth
+1  A: 

Fewer, bigger translation units don't take advantage of parallel compilation. I don't know which compilers and platforms you are using, but compiling multiple translation units in parallel can significantly decrease build time...

Cătălin Pitiș
That's not necessarily true - the OP mentions he's using Visual Studio, and the latest versions can use multiple processors to compile a single source file if you add the appropriate command-line parameter.
Timo Geusch
It's not guaranteed that the speed-up is the same. Using multiple threads to compile a single source file is a hard problem; I doubt it's as efficient as distributing the build across multiple separate compilations.
Joseph Garvin
+2  A: 

Bundling a large number of C++ source files into a single file is an approach that has been mentioned a few times recently, especially by people building large systems and pulling in complicated header files (that'll be boost, then).

As you mention VS: I've found that the number of include files in a project, and especially the size of the include path, affects Visual C++'s compilation times far more than it affects g++'s. This is especially the case with lots of nested includes (again, boost does that), as a large number of file searches are needed to locate every include file required by the source code. Combining the code into a single source file means the compiler can be much smarter about finding those includes, and there are obviously fewer of them to find, since files in the same subproject are likely to include a very similar set of header files.

The "lots of compilation units" approach to C++ development usually comes from a desire to decouple classes and minimise dependencies between classes so the compiler only has to rebuild the minimal set of files in case you make any changes. This is generally a good approach but often not really feasible in a subproject simply because the files in there have dependencies on each other so you'll end up with quite large rebuilds anyway.

Timo Geusch
A: 

Following on from sharptooth's post, I'd examine the resulting executables in some detail. If they differ, I'd limit your technique to debug builds and fall back to the original project configuration for the main release build. When checking the executable, I'd also look at its memory footprint and resource usage at startup and while running.

Shane MacLaughlin
+2  A: 

I'm not sure if this is relevant in your case, but maybe you can use declarations instead of definitions to reduce the number of #includes you need. The pimpl idiom can serve the same purpose (see the sketch below). That should reduce both the number of source files that need to be recompiled after each change and the number of headers that have to be pulled in.
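
A minimal pimpl sketch, with hypothetical names, that keeps the heavy headers out of the class's own header:

    // ---- widget.h ----
    #pragma once
    #include <memory>

    class WidgetImpl;                      // declaration instead of definition

    class Widget {
    public:
        Widget();
        ~Widget();                         // defined in widget.cpp, where
                                           // WidgetImpl is a complete type
        void draw();
    private:
        std::unique_ptr<WidgetImpl> impl_;
    };

    // ---- widget.cpp ----
    #include "widget.h"
    #include <vector>                      // heavy includes stay out of the header

    class WidgetImpl {
    public:
        std::vector<int> points;
    };

    Widget::Widget() : impl_(new WidgetImpl) {}
    Widget::~Widget() = default;
    void Widget::draw() { /* work with impl_->points */ }

Clients that include widget.h no longer recompile when the implementation details inside widget.cpp change.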

Robert S. Barnes
+3  A: 

I don't think reducing the number of compilation units is a good idea. You are trying to solve the problem of long compile times, and this approach seems to help with it, but look at what you get in addition:

  1. Increased compilation time during development. A developer usually modifies a few files at a time, and compiling 3-4 small files will probably be faster than compiling one very big file.
  2. As you mentioned, the code becomes harder to navigate; IMHO this is extremely important.
  3. You can get interference between the .cpp files included in one .cxx file (see the sketch after this list):

    a. It is common practice to locally define (for debug builds) a macro named new in a .cpp file for memory-leak checking. Unfortunately, this cannot be done before including headers that use placement new (as some STL and Boost headers do).

    b. It is common practice to add using declarations and directives in .cpp files. With your approach, these may affect headers included later.

    c. Name conflicts become more likely.
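
As an illustrative sketch of point (b), with invented file names, here is what the compiler effectively sees after the unity #includes:

    // module.cxx effectively expands to:

    // ---- from a.cpp ----
    #include <algorithm>
    using namespace std;      // was meant to be local to a.cpp

    // ---- from b.cpp ----
    static int count = 0;     // fine when b.cpp is compiled alone; here the
                              // using-directive above makes plain "count"
                              // clash with std::count
    int next_id() { return ++count; }  // error: reference to 'count' is ambiguous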

IMHO, a much cleaner (but maybe more expensive) way to speed up compilation is to use a distributed build system. These are especially effective for clean builds.

Konstantin
+1  A: 

The concept is called a unity build.

Suma