What do you do when you have a set of .h files that has fallen victim to the classic 'gordian knot' situation, where to #include one .h means you end up including almost the entire lot? Prevention is clearly the best medicine, but what do you do when this has happened before the vendor (!) has shipped the library?

Here's an extension to the question, and probably the more pertinent one: should you even attempt to disentangle the dependencies in the first place?

+1  A: 

As you have the opportunity, you should refactor the code to reduce includes that are too large. However, that assumes you can achieve some sort of package cohesion. If you disentangle things just to discover that every user of the code has to include all the elements anyway, the end result is the same.

Another option is to use #defines to configure sections on and off. Regardless, for an existing code base the solution is to move toward package cohesion.
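A minimal sketch of the #define approach, shown as a single file for brevity. The names here (vendor.h, VENDOR_NO_GRAPHICS, VENDOR_HAS_GRAPHICS, vendor_core_version, vendor_graphics.h) are all hypothetical and only illustrate the idea; a real library would pick its own sections and macros.

```c
/* Client code opts out of a section BEFORE including the header. */
#define VENDOR_NO_GRAPHICS

/* ---- what would live in vendor.h (hypothetical umbrella header) ---- */
#ifndef VENDOR_H_INCLUDED
#define VENDOR_H_INCLUDED

/* Core section: always declared. */
int vendor_core_version(void);

#ifndef VENDOR_NO_GRAPHICS
/* Graphics section: only pulled in when the client has not opted out.
 * Because this group is skipped here, the (nonexistent) file is never opened. */
#include "vendor_graphics.h"
#define VENDOR_HAS_GRAPHICS 1
#else
#define VENDOR_HAS_GRAPHICS 0
#endif

#endif /* VENDOR_H_INCLUDED */

/* ---- what would live in vendor_core.c ---- */
int vendor_core_version(void)
{
    return 1;
}
```

This only hides the tangle from clients that opt out; the headers themselves are still tangled, which is why moving toward cohesion is the longer-term fix.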

Read: http://ivanov.files.wordpress.com/2007/02/sedpackages.pdf and research issues related to package cohesion.

caskey
+1  A: 

I've untangled that knot a few times, and it generally helps a lot when maintaining a system to reduce the .h dependencies as much as possible. There are decent tools for generating dependency trees (I was using Klocwork at the time).

The downside I found was with conditional compilation. Someone might remove a header file because they think we don't need it, but it turns out that it is only unnecessary because VxWorks has some screwed up headers... on Solaris (or any reasonable POSIX system) you do need it.

Chris Arguin
+3  A: 

I've done this on a C++ code base that was already split into many libraries (which was a good start).

I had to work out (or guess) which library was the most depended upon and which depended upon nothing else in the code base. I then processed each library in turn.

I looked at each module (*.cpp file) in turn. I made sure that its own header was #included first and commented out the rest of its #includes. Then I commented out all the #includes in that header file and recompiled just that module to let the compiler tell me what was needed. I would uncomment the first header that seemed to be needed and review that one, recursing as necessary. It was interesting to see how many headers ended up not being needed.

Where only the name is needed (because you have a pointer or reference), use 'class name;' or 'struct name;' instead of #including the header file. This is called a forward declaration.

The compiler is very helpful in telling you what the dependencies are when you comment out #includes (you need to recompile with ALL the compilers you have to maintain portability).

Sometimes I had to move modules between libraries so that no pairs or groups of libraries were mutually dependent.

quamrana
+1 for forward declarations in header, full include in implementation file.
CiscoIPPhone
A: 

There is a balance to be struck between an enormous number of finely organized headers and a single header that includes everything. Consider the Standard C library; there are some biggish headers like <stdio.h>, which declares a lot of functions, but they are all related to I/O. There are other headers that are more of a miscellany - notably <stdlib.h>.

The Goddard Space Flight Center guidelines for C are worth hunting down.

The basic rule is that each header should declare the facilities provided by a suitable (usually small) set of source files. The facilities and header should be self-contained. That is, if someone needs the code in header "something.h", then that should be the only header that must be added to the compilation. If there are facilities needed by "something.h" that are not declared in the header, then it must include the relevant headers. That can mean that headers end up including <stddef.h> because one of the functions uses size_t, for example.
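To illustrate the self-contained rule, here is a rough sketch collapsed into one file. The header name buffer.h, the guard macro, and the function buffer_count_nonzero are made up for the example; the point is only that the header includes <stddef.h> itself because its interface uses size_t.

```c
/* ---- what would live in buffer.h (hypothetical) ---- */
#ifndef BUFFER_H_INCLUDED
#define BUFFER_H_INCLUDED

#include <stddef.h>   /* needed here: the declaration below uses size_t */

/* Count the bytes in data[0..len-1] that are not zero. */
size_t buffer_count_nonzero(const unsigned char *data, size_t len);

#endif /* BUFFER_H_INCLUDED */

/* ---- what would live in buffer.c ---- */
/* In a real source file, #include "buffer.h" would be the FIRST include,
 * so the compiler checks that the header compiles on its own. */
size_t buffer_count_nonzero(const unsigned char *data, size_t len)
{
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (data[i] != 0)
            count++;
    }
    return count;
}
```

A client that wants this function then adds only "buffer.h" and does not need to know that size_t comes from <stddef.h>.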

As @quamrana points out, you can use forward declarations for structures (not classes, since the question is tagged C and not C++) when appropriate - which primarily means when the interface takes pointers and does not need to know the size of the structures or any of the members.
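A small sketch of a forward-declared structure in C, again squeezed into one file. The type struct widget and its functions are hypothetical names chosen for illustration. The "header" part never needs the structure's definition because the interface only passes pointers.

```c
#include <stdlib.h>   /* malloc, free (needed by the implementation part only) */

/* ---- what would live in widget.h (hypothetical) ---- */
struct widget;   /* forward declaration: an incomplete type, size unknown */

struct widget *widget_create(int id);
int widget_id(const struct widget *w);
void widget_destroy(struct widget *w);

/* ---- what would live in widget.c ---- */
/* Only the implementation file sees the full definition. */
struct widget {
    int id;
};

struct widget *widget_create(int id)
{
    struct widget *w = malloc(sizeof *w);
    if (w != NULL)
        w->id = id;
    return w;   /* NULL if allocation failed */
}

int widget_id(const struct widget *w)
{
    return w->id;
}

void widget_destroy(struct widget *w)
{
    free(w);
}
```

Code that includes only the header part can hold and pass struct widget pointers, but cannot take sizeof(struct widget) or touch its members, so it does not need to be recompiled when the members change.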

Jonathan Leffler