views:

360

answers:

7

I've read everything I could find on this topic, including a couple of very helpful discussions on this site, the NASA coding guidelines and Google C++ guidelines. I even bought the "physical C++ design" book recommended on here (sorry, forgot the name) and got some useful ideas from that. Most sources seem to agree - header files should be self-contained, i.e. they include what they need so that a cpp file could include the header without including any others and it would compile. I also get the point about forward declaring rather than including whenever possible.

That said, how about if foo.cpp includes bar.h and qux.h, but it turns out that bar.h itself includes qux.h? Should foo.cpp then avoid including qux.h? Pro: cleans up foo.cpp (less "noise"). Con: if someone changes bar.h to no longer include qux.h, foo.cpp mysteriously starts failing to compile. Also causes the dependency between foo.cpp and qux.h not to be obvious.

If your answer is "a cpp file should #include every header it needs", taken to its logical conclusion that would mean that pretty much every cpp file has to #include <string>, <cstddef> etc. since most code will end up using those, and if you're not supposed to rely on some other header including them, your cpp needs to include them explicitly. That seems like a lot of "noise" in the cpp files.

Thoughts?

Previous discussions:

http://stackoverflow.com/questions/188449/what-are-some-techniques-for-limiting-compilation-dependencies-in-c-projects

http://stackoverflow.com/questions/181921/your-preferred-cc-header-policy-for-big-projects

http://stackoverflow.com/questions/51561/how-do-i-automate-finding-unused-include-directives

ETA: Inspired by previous discussions on here, I've written a Perl script to successively comment out each 'include' and 'using', then attempt to recompile the source file, to figure out what's not needed. I've also figured out how to integrate it with VS 2005, so you can double-click to go to "unused" includes. If anyone wants it let me know...very much experimental right now though.

+6  A: 

If your answer is "a cpp file should #include every header it needs", taken to its logical conclusion that would mean that pretty much every cpp file has to #include <string>, <cstddef> etc. since most code will end up using those, and if you're not supposed to rely on some other header including them, your cpp needs to include them explicitly.

Yup. That's the way I prefer it.

If the 'noise' is too much to bear, it's OK to have a 'global' include file that contains the usual, common set of includes (like stdafx.h in a lot of Windows programs), and include that at the start of each .cpp file (that helps with precompiled headers, too).

Michael Burr
Actually, the stdafx thing is one pattern I'm trying to get away from. One issue is that our code must be cross platform, and as far as I can see pch's don't work the same way in VC++ and gcc. Also, we ended up with an "n squared" include complexity since everything ended up including everything, leading to long compile times. I guess we could put STL includes in a common place, but that means every compilation unit gets them whether it wants or not. I've also noticed that the STL headers include each other, so if you include (e.g.) 'string', you may not need 'memory' (just an example).
Darren Sargent
I can understand why you wouldn't want to use the stdafx.h pattern, but it can work and it's a pattern that I don't consider 'bad form'. Also, I wouldn't advocate throwing everything and the kitchen sink into the stdafx header - just the headers that **really** are used pretty much everywhere. The .cpp file can still include the other headers that are less frequently needed after the stdaf.h file. And I don't think I'd worry about multiple inclusion of otherwise necessary headers - they shouldn't affect build times too much (the subsequent include of a file won't get processed).
Michael Burr
+2  A: 

I think that you should still include both files. This helps with maintaining the code.

// Foo.cpp

#include <Library1>
#include <Library2>

I can read that and easily see which libraries it uses. If Library2 used Library1, and it was transformed to this:

// Foo.cpp

#include <Library2>

But I still saw Library1 code, I might be slightly confused. It's not hard to guess that some other library must be including it, but it's still a thought process that has to occur.

Being explicit means I don't have to guess, even for an extra micro second of compilation.

GMan
Would you extend this to STL libraries as well? e.g. if a cpp file uses "string", you'd want to see the include? In every single file that uses "string"? I can definitely see your reasoning for user headers, but looking at my code, most source files will have to have a bunch of STL includes at the top.
Darren Sargent
Wait, what else would you do with them? If your class uses strings in the header, it would go there, but if it was just the implementation that sued it, it would go in the cpp. Each cpp *must* include the header it needs.
GMan
A: 

Each header file provides definitions needed to use a service of some type (a class definition, some macros, whatever). If you are directly using the services exposed through a header file, include that header file even if another header file also includes it. On the other hand, if you are using those services only indirectly, only in the context of the other header file's services, don't include the header file directly.

In my opinion this isn't a big deal either way. The second inclusion of a header file doesn't get fully parsed; it's basically commented out for all but the first time it's included. As things change and you find out you were using a header file that you weren't directly including, it's not a big deal to add it.

Darryl
+1  A: 

I think you should include both until it becomes unbearable. Then refactor your classes and source files to reduce coupling, on grounds that if just listing all your dependencies is onerous, then you probably have too many dependencies...

A compromise might be to say that if something in bar.h contains a class definition (or function declaration) which necessarily requires another class definition from qux.h, then bar.h can be assumed/required to include qux.h, and the user of the API in bar.h needn't bother including both.

So, suppose the client wants to think of itself as "really" only being interested in the bar API, but has to use qux too, because it calls a bar function that takes or returns a qux object by value. Then it can just forget that qux has its own header and imagine that it's one big bar API defined in bar.h, with qux just being a part of that API.

This means you can't count on, for example, searching cpp files for mentions of qux.h to find all clients of qux. But I'd never rely on that anyway, since in general it's too easy to accidentally miss explicitly including a dependency that's already indirectly included, and never realise you haven't listed all the relevant headers. So you probably shouldn't in any case assume header lists are complete, but instead use doxygen (or gcc -M, or whatever) to get a complete list of dependencies, and search that.

Steve Jessop
Thanks...this is a good point. Rampant includes *could* indeed be a code smell, a sign of excessive coupling. I will definitely take a look at that.
Darren Sargent
A: 

how about if foo.cpp includes bar.h and qux.h, but it turns out that bar.h itself includes qux.h? Should foo.cpp then avoid including qux.h?

If foo.cpp uses anything from qux.h directly, then it should include this header itself. Otherwise, since bar.h needs qux.h, I would rely onbar.h including everything it needs.

sbi
A: 

One way to think about whether foo.cpp directly uses qux.h, is to think about what would happen if foo.cpp no longer needs to include bar.h. If foo.cpp would still need to include qux.h, then it should be listed explicitly. If it would no longer need anything from qux.h, it is probably no necessary to include it explicitly.

KeithB
A: 

The "never optimize without profiling" rule applies here, I think.

(This answer is biased toward "performance" over "clutter"; the clutter can generally be cleaned up at will, admittedly it gets harder later, but the performance impacts you on a regular basis.)

Try it both ways, see if there's any significant speed improvement for your compiler, and only then consider removing "duplicate" headers -- because as other answers point out, there's a long-term maintainability penalty you take by removing the duplicates.

Consider getting something like Sysinternals FileMon to see if you're actually generating filesystem hits at all for those duplicate includes (most compilers won't, with correct header guards).

Our experience here indicates that actively seeking out headers that are completely unused (or could be, with proper forward declarations) and removing them is far more worthy of time and effort than figuring out the include chains for possible duplicates. And a good lint (splint, PC-Lint, etc) can help you with this determination.

Even more worthy of our time and effort was figuring out how to get our compiler to handle multiple compilation units per execution (almost a linear speedup for us, up to a point -- the compilation was utterly dominated by compiler startup).

After that, you may want to consider the "single big CPP" madness. It can be quite impressive.

leander