I often find that the headers section of a file gets larger and larger over time but never gets smaller. Throughout the life of a source file, classes may have been moved and refactored, and it's very possible that quite a few #includes no longer need to be there. Leaving them in only prolongs the compile time and adds unnecessary compilation dependencies. Trying to figure out which ones are still needed can be quite tedious.

Is there some kind of tool that can detect superfluous #include directives and suggest which ones I can safely remove?
Does lint do this maybe?

+11  A: 

I thought that PCLint would do this, but it has been a few years since I've looked at it. You might check it out.

I looked at this blog and the author talked a bit about configuring PCLint to find unused includes. Might be worth a look.

itsmatt
Good find! I'll have to use that.
crashmstr
nice one. too bad PC-lint isn't free...
shoosh
I use PCLint regularly and it does tell me of unused headers. I'm careful to comment out the header #include and re-compile to be sure that the header is truly unused...
Harold Bamford
Thanks for the confirmation, Harold.
itsmatt
+12  A: 

The problem with detecting superfluous includes is that it can't be just a type dependency checker. A superfluous include is a file which provides nothing of value to the compilation and does not alter another item on which other files depend. There are many ways a header file can alter a compile: by defining a constant, by redefining and/or deleting a used macro, or by adding a namespace which alters the lookup of a name somewhere down the line. To detect items like the namespace you need much more than a preprocessor; in fact, you almost need a full compiler.

Lint is more of a style checker and certainly won't have this full capability.

I think you'll find the only way to detect a superfluous include is to remove, compile and run suites.
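To illustrate the point about name lookup, here is a minimal sketch in which removing a header still compiles but silently changes which function is called. The file names base.hpp and extra.hpp, the describe functions, and the SIMULATE_REMOVED_INCLUDE macro are all hypothetical, and both "headers" are inlined into one file so the example is self-contained.

```cpp
#include <string>

// Hypothetical stand-in for "base.hpp", which is always included.
inline std::string describe(double) { return "double"; }

// Hypothetical stand-in for "extra.hpp". Defining SIMULATE_REMOVED_INCLUDE
// before this point simulates deleting the line #include "extra.hpp".
#ifndef SIMULATE_REMOVED_INCLUDE
inline std::string describe(int) { return "int"; }
#endif

inline std::string describe_forty_two() {
    // With extra.hpp present, overload resolution picks describe(int).
    // Without it, the call still compiles: 42 converts to double and
    // describe(double) is called instead, with no diagnostic.
    return describe(42);
}
```

A tool that only checks "does it still compile?" would report extra.hpp as removable here, which is why running the test suites matters.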

JaredPar
None of this will be an issue if the include files are laid out well. If you ever need to include file A before file B, you're doing it wrong (and I've worked on projects where they did it wrong).
David Thornley
@David, yes but that depends on the years of devs before you doing it correctly. I can say with great certainty that the odds of that happening favor the house, not you :(
JaredPar
Yes, but I generally find out about that when modifying a program, and suddenly I've got a compilation error (if I'm lucky) or an obscure bug. That seems to keep the #include files honest, at least in the long run.
David Thornley
I'd say the exact contrary. All you need is a type dependency checker. It might not compile after you've arranged includes accordingly, but these are problems that should be dealt with anyway.
Benoît
@Benoit, then you would be ignoring a class of issues that compile but semantically change the meaning of your program. Consider how a #define in one file can alter a #if branch in another. Removing a header can still allow this to compile with different results
JaredPar
Dan
A: 

Gimpel Software's PC Lint can report on when an include file has been included more than once in a compilation unit, but it can't find include files which are not needed in the way you are looking for.

Edit: It can. See itsmatt's answer

crashmstr
Are you *sure* about that? I haven't used FlexeLint (same as PCL) in a few years on C++ code, but even recently on C code, I could swear I saw a couple messages (I think it's code 766?) about unused header files. Just checked (v8.0), see section 11.8.1. of manual.
Dan
Yeah, see my edit pointing to itsmatt's updated answer.
crashmstr
+5  A: 

You can write a quick script that erases a single #include directive and compiles the project. If no compilation errors occur, it logs the name in the #include and the file it was removed from.

Let it run during the night, and the next day you will have a 100% correct list of include files you can remove.

Sometimes brute-force just works :-)


edit: and sometimes it doesn't :-). Here's a bit of information from the comments:

  1. Sometimes you can remove two header files separately, but not both together. A solution is to remove the header files during the run and not bring them back. This will find a list of files you can safely remove, although there might be a solution with more files to remove which this algorithm won't find. (It's a greedy search over the space of include files to remove, so it will only find a local maximum.)
  2. There may be subtle changes in behavior if you have some macros redefined differently depending on some #ifdefs. I think these are very rare cases, and the Unit Tests which are part of the build should catch these changes.
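The greedy variant from point 1 can be sketched roughly as follows. This is only an illustration under stated assumptions: greedy_remove_includes is a hypothetical name, and the still_builds callback is a placeholder for whatever actually compiles the project and runs the tests (for example, by writing the lines back to disk and invoking the build).

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Greedily removes #include lines from `lines` (modified in place).
// An include that is removed successfully stays out for all later attempts.
// Returns the list of removed lines. `still_builds` is a hypothetical hook
// that should return true if the project compiles and its tests pass.
std::vector<std::string> greedy_remove_includes(
    std::vector<std::string>& lines,
    const std::function<bool(const std::vector<std::string>&)>& still_builds)
{
    std::vector<std::string> removed;
    for (std::size_t i = 0; i < lines.size();) {
        // Only lines that start with "#include" are candidates.
        if (lines[i].rfind("#include", 0) != 0) { ++i; continue; }

        const auto pos = lines.begin() + static_cast<std::ptrdiff_t>(i);
        const std::string candidate = *pos;
        lines.erase(pos);

        if (still_builds(lines)) {
            removed.push_back(candidate);  // keep it out; index i is now the next line
        } else {
            // The build broke: put the include back and move on.
            lines.insert(lines.begin() + static_cast<std::ptrdiff_t>(i), candidate);
            ++i;
        }
    }
    return removed;
}
```

Because removals are never undone, two headers that are each removable on their own but not together end up with exactly one of them kept. As noted above, the result is a local maximum, and it is only as trustworthy as the build and tests behind still_builds.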
Gilad Naor
Be careful of this - say there are two header files which both include a definition of something. You can remove either, but not both. You'll need to be a bit more thorough in your brute force approach.
Dominic Rodger
Maybe this is what you meant, but a script that removes a single include, and leaves the last removed include out if it was successfully removed would do the trick.
Dominic Rodger
Bad idea. If a header file #defines a constant BLAH and another header file checks #ifdef BLAH, removing the first header file may still successfully compile but your behaviour has changed.
Graeme Perrow
Didn't think about it too much, but you are correct. God is in the details, as they say. I'm not sure leaving the .h file out at each step is sufficient - this is still a Greedy search (capital G), which doesn't always find the optimal solution. It will give a CORRECT solution, unlike the original.
Gilad Naor
@Graeme - Hopefully the Unit-Tests, which are part of the build (hopefully) will catch such faults.
Gilad Naor
This also can cause problems with system headers, since different implementations might have different things included in #include <vector>. Even if you stick to one compiler, the headers could change over different versions.
David Thornley
This won't find cases where you're including a header that includes the header that you really need.
bk1e
+9  A: 

It's not automatic, but Doxygen will produce dependency diagrams for #included files. You will have to go through them visually, but they can be very useful for getting a picture of what is using what.
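If the include graph is available as data rather than as a picture, the visual check can be approximated in code. The sketch below is not part of Doxygen; IncludeGraph, collect_reachable and redundant_direct_includes are hypothetical names, and the graph is assumed to have been extracted by some other means. It flags a direct include that is already reachable through another direct include.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps each file to the files it includes directly (assumed input format).
using IncludeGraph = std::map<std::string, std::vector<std::string>>;

// Adds every file reachable from `start` through one or more includes to `seen`.
void collect_reachable(const IncludeGraph& graph, const std::string& start,
                       std::set<std::string>& seen) {
    const auto it = graph.find(start);
    if (it == graph.end()) return;
    for (const auto& next : it->second) {
        if (seen.insert(next).second) collect_reachable(graph, next, seen);
    }
}

// Returns the direct includes of `file` that are also pulled in
// transitively by one of its other direct includes.
std::vector<std::string> redundant_direct_includes(const IncludeGraph& graph,
                                                   const std::string& file) {
    std::vector<std::string> redundant;
    const auto it = graph.find(file);
    if (it == graph.end()) return redundant;
    for (const auto& candidate : it->second) {
        for (const auto& other : it->second) {
            if (other == candidate) continue;
            std::set<std::string> seen;
            collect_reachable(graph, other, seen);
            if (seen.count(candidate) != 0) {
                redundant.push_back(candidate);
                break;
            }
        }
    }
    return redundant;
}
```

A flagged include is only a candidate for review: if the file uses the header directly, keeping the explicit #include is usually safer than relying on a chain through other headers.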

anon
This is a great way to see chains.. seeing A -> B -> C -> D and A -> D immediately reveals the redundancy.
Tom
+6  A: 

Google's cppclean can find several categories of C++ problems, but it can't (yet) find superfluous #includes. It's at least worth keeping an eye on for future developments.

Josh Kelley
+4  A: 

The CScout refactoring browser can detect superfluous include directives in C (unfortunately not C++) code. You can find a description of how it works in this journal article.

Diomidis Spinellis
+3  A: 

Sorry to (re-)post here, people often don't expand comments.

Check my comment to crashmstr, FlexeLint / PC-Lint will do this for you. Informational message 766. Section 11.8.1 of my manual (version 8.0) discusses this.

Also, and this is important, keep iterating until the message goes away. In other words, after removing unused headers, re-run lint, more header files might have become "unneeded" once you remove some unneeded headers. (That might sound silly, read it slowly & parse it, it makes sense.)

Dan
I know exactly what you mean, and my reaction was "Ewwww". I hate code like that.
David Thornley
+2  A: 

I've tried using Flexelint (the unix version of PC-Lint) and had somewhat mixed results. This is likely because I'm working on a very large and knotty code base. I recommend carefully examining each file that is reported as unused.

The main worry is false positives. Multiple includes of the same header are reported as an unneeded header. This is bad since Flexelint does not tell you what line the header is included on or where it was included before.

One of the ways automated tools can get this wrong:

In A.hpp:

class A { 
  // ...
};

In B.hpp:

#include "A.hpp"

class B {
    public:
        A foo;
};

In C.cpp:

#include "C.hpp"  

#include "B.hpp"  // <-- Unneeded, but lint reports it as needed
#include "A.hpp"  // <-- Needed, but lint reports it as unneeded

If you blindly follow the messages from Flexelint you'll muck up your #include dependencies. There are more pathological cases, but basically you're going to need to inspect the headers yourself for best results.

I highly recommend this article on Physical Structure and C++ from the blog Games from Within. It recommends a comprehensive approach to cleaning up the #include mess:

Guidelines

Here’s a distilled set of guidelines from Lakos’ book that minimize the number of physical dependencies between files. I’ve been using them for years and I’ve always been really happy with the results.

  1. Every cpp file includes its own header file first. [snip]
  2. A header file must include all the header files necessary to parse it. [snip]
  3. A header file should have the bare minimum number of header files necessary to parse it. [snip]
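One common way to apply guideline 3 is to forward-declare a class instead of including its header when only a pointer or reference to it is needed. The sketch below uses hypothetical names (Panel, Widget, panel.hpp, widget.hpp) and puts both "headers" in one file so it is self-contained; it is not taken from the article.

```cpp
// Hypothetical "panel.hpp": Panel only stores a pointer to Widget, so a
// forward declaration is enough to parse this header.
class Widget;  // used instead of #include "widget.hpp"

class Panel {
public:
    void attach(Widget* widget) { widget_ = widget; }
    bool has_widget() const { return widget_ != nullptr; }

private:
    Widget* widget_ = nullptr;  // a pointer does not need the full definition
};

// Hypothetical "widget.hpp": only files that create a Widget or call its
// members need to include the full definition.
class Widget {};
```

This does not work when the class is held by value, as with the A foo; member in the B.hpp example above, because the compiler then needs the complete type.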
Ben Martin
Lakos's book is great for education -- aside from his outdated observations on compiler technology.
Tom
+1  A: 

This article explains a technique for removing #includes that relies on Doxygen's parsing. It's just a Perl script, so it's quite easy to use.

Steve Gury
+1  A: 

I've never found a full-fledged tool that accomplishes what you're asking. The closest thing I've used is IncludeManager, which graphs your header inclusion tree so you can visually spot things like headers included in only one file and circular header inclusions.

Dan Olson
A: 

Maybe a little late, but I once found a WebKit Perl script that did just what you wanted. It'll need some adapting, I believe (I'm not well versed in Perl), but it should do the trick:

http://trac.webkit.org/browser/branches/old/safari-3-2-branch/WebKitTools/Scripts/find-extra-includes

(this is an old branch because trunk doesn't have the file anymore)

rubenvb