I know PC-Lint can tell you about headers which are included but not used. Are there any other tools that can do this, preferably on Linux?

We have a large codebase that over the last 15 years has seen plenty of functionality move around, but the leftover #include directives rarely get removed when functionality moves from one implementation file to another, leaving us with a pretty good mess by this point. I can obviously do the painstaking thing of removing all the #include directives and letting the compiler tell me which ones to re-include, but I'd rather solve the problem in reverse (find the unused ones) than rebuild a list of used ones.

+1  A: 

As far as I know, there isn't one (other than PC-Lint), which is a shame, and surprising. I've seen the suggestion to do this bit of pseudocode (which is basically automating your "painstaking process"):

for every cpp file
    for every header include
        comment out the include
        compile the cpp file
        if (compile_errors)
            un-comment the include
        else
            remove the include from the cpp file

Put that in a nightly cron job, and it should do the job, keeping the project in question free of unused headers (you can always run it manually, obviously, but it'll take a long time to execute). The only problem is when removing a header doesn't generate a compile error but still changes the code that gets produced.

Cinder6
That still unfortunately doesn't clean up headers that include other headers that aren't required (and worse, may cause some "programming by coincidence" in other implementation files that get the headers they need through other headers that actually don't need them). It at least reduces the number of spurious includes in cpp files, but I would like to eliminate them in other headers as well.
Nick Bastin
It's also inadvisable to remove every header that can be removed. Consider #include <vector> and #include <algorithm>. In some implementations <vector> will include <algorithm>, but that isn't guaranteed. Robust code should include both (if they're both used). Your described method could remove #include <algorithm> depending on the implementation of <vector>.
caspin
This is true. Nick, are you more concerned with local header files (or do you at least have a lot of them)? If so, you could modify the above algorithm to not mess with library headers, and tune those manually. It's a pain, but it would cut the work down, at least.
Cinder6
+1  A: 

I've done this manually, and it's worth it in the short term (or is it the long term? It takes a long time) due to reduced compile time:

  1. Fewer headers to parse for each cpp file.
  2. Fewer dependencies - the whole world doesn't need re-compiling after a change to one header.

It's also a recursive process - each header file that stays in needs examining to see if any header files it includes can be removed. Plus sometimes you can substitute forward declarations for header includes.

Then the whole process needs repeating every few months/year to keep on top of leftover headers.

Actually, I'm a bit annoyed with C++ compilers: they should be able to tell you what's not needed. The Microsoft compiler can tell you when a change to a header file can be safely ignored during compilation.

quamrana
+4  A: 

DISCLAIMER: My day job is working for a company that develops static analysis tools.

I would be surprised if most (if not all) static analysis tools did not have some form of header usage check. You could use this Wikipedia page to get a list of available tools and then email the companies to ask them.

Some points you might consider when you're evaluating a tool:

For function overloads, you want all headers containing overloads to be visible, not just the header that contains the function that was selected by overload resolution:

// f1.h
void foo (char);

// f2.h
void foo (int);


// bar.cc
#include "f1.h"
#include "f2.h"

int main ()
{
  foo (0);  // Calls 'foo(int)' but all functions were in overload set
}

If you take the brute-force approach of first removing all headers and then re-adding them until the code compiles, then if 'f1.h' is added first the code will compile, but the semantics of the program will have changed: 'foo (0)' now calls 'foo(char)' instead of 'foo(int)'.
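To make the difference observable, here is a small single-file sketch. It is not the original example: the two namespaces are my stand-ins for "only f1.h included" and "both headers included", and I gave 'foo' a body that returns a string naming the overload that was picked.

```cpp
#include <cassert>
#include <string>

// Stand-in for a translation unit that sees only "f1.h".
namespace only_f1 {
    inline std::string foo(char) { return "foo(char)"; }
    // Still compiles: the int literal 0 converts implicitly to char.
    inline std::string call() { return foo(0); }
}

// Stand-in for a translation unit that sees "f1.h" and "f2.h".
namespace both_headers {
    inline std::string foo(char) { return "foo(char)"; }
    inline std::string foo(int)  { return "foo(int)"; }
    // Exact match on int wins overload resolution.
    inline std::string call() { return foo(0); }
}
```

Both versions build without a diagnostic, which is exactly why a "does it compile?" check cannot tell them apart.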

A similar rule applies when you have partial and explicit specializations. It doesn't matter whether the specialization is selected or not; you need to make sure that all specializations are visible:

// f1.h
template <typename T>
void foo (T);

// f2.h
template <>
void foo (int);

// bar.cc
#include "f1.h"
#include "f2.h"


int main ()
{
  foo (0);  // Calls specialization 'foo<int>(int)'
}

As for the overload example, the brute force approach may result in a program which still compiles but has different behaviour.
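The same kind of single-file sketch works for the specialization case. Again, the namespaces and the string return values are my own additions to make the selected definition visible; they are not part of the example above.

```cpp
#include <cassert>
#include <string>

// Stand-in for a translation unit that sees only the primary template.
namespace primary_only {
    template <typename T>
    std::string foo(T) { return "primary"; }

    inline std::string call() { return foo(0); }  // instantiates foo<int>
}

// Stand-in for a translation unit that also sees the specialization.
namespace with_specialization {
    template <typename T>
    std::string foo(T) { return "primary"; }

    template <>
    inline std::string foo<int>(int) { return "specialization"; }

    inline std::string call() { return foo(0); }  // uses the specialization
}
```

In a real multi-file program it is worse than a silent behaviour change: if one translation unit instantiates 'foo<int>' without having seen the declaration of the specialization that exists elsewhere, the program is ill-formed with no diagnostic required.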

Another related type of analysis that you can look out for is checking if types can be forward declared. Consider the following:

// A.h
class A { };

// foo.h
#include "A.h"
void foo (A const &);

// bar.cc
#include "foo.h"

void bar (A const & a)
{
  foo (a);
}

In the above example, the definition of 'A' is not required, and so the header file 'foo.h' can be changed so that it has a forward declaration only for 'A':

// foo.h
class A;
void foo (A const &);

This kind of check also reduces header dependencies.
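As a rough single-file sketch of why the forward declaration is enough: everything up to the marked line compiles with 'A' still incomplete, because references to 'A' are only passed along. The 'int' return type and the 'value' member are my additions so that the result can be checked; the original 'foo' returns void.

```cpp
#include <cassert>

class A;                      // forward declaration, as in the revised foo.h

int foo(A const&);            // declaring a function on A const& is fine

int bar(A const& a)           // bar.cc: forwards the reference, never
{                             // looks inside A, so no definition needed
    return foo(a);
}

// ---- from here on: the one file that really needs the definition ----
class A {
public:
    int value = 7;            // hypothetical member, for illustration only
};

int foo(A const& a) { return a.value; }  // accesses a member: needs A.h
```

As soon as a file accesses a member, creates an 'A' by value, or needs its size, the forward declaration stops being sufficient and the include has to stay.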

Richard Corden
Most that I have looked at do not have a header usage check of this nature. You make a very good point about overloads and specializations, but thankfully our conventions are such that these would basically never be in different headers.
Nick Bastin
Also, I've been down the road with that Wikipedia page. The C/C++ section is very weak... I suppose I should go down the list of commercial providers and see which ones support C++. Also, I'm perfectly fine with people suggesting their own product - it's more than I had to go on before, and your advice in general is very informative.
Nick Bastin
+1  A: 

Have a look at Dehydra.

From the website:

Dehydra is a lightweight, scriptable, general purpose static analysis tool capable of application-specific analyses of C++ code. In the simplest sense, Dehydra can be thought of as a semantic grep tool.

It should be possible to come up with a script that checks for unused #include files.

Ton van den Heuvel