views: 1559
answers: 3
The only real use of the --whole-archive linker option that I have seen is in creating shared libraries from static ones. Recently I came across Makefiles which always use this option when linking with in-house static libraries. This of course causes the executables to unnecessarily pull in unreferenced object code. My reaction was that this is plain wrong; am I missing something here?

The second question I have has to do with something I read regarding the --whole-archive option but couldn't quite parse. It was something to the effect that --whole-archive should be used when linking with a static library if the executable also links with a shared library that (in part) contains the same object code as the static library. That is, the shared library and the static library overlap in terms of object code. Using this option would force all symbols (regardless of use) to be resolved in the executable, which is supposed to avoid object-code duplication. This is confusing: if a symbol is referenced in the program, it must be resolved uniquely at link time, so what is this business about duplication? (Forgive me if this paragraph is not quite the epitome of clarity.)

Thanks

A: 

I agree that using --whole-archive to build executables is probably not what you want (due to linking in unneeded code and creating bloated software). If they had a good reason to do so, they should have documented it in the build system; as it is, you are left guessing.

As to the second part of your question: if an executable links both a static library and a shared library that contain (in part) the same object code, then --whole-archive will ensure that at link time the code from the static library is preferred. This is usually what you want when you do static linking.

lothar
+3  A: 

There are legitimate uses of --whole-archive when linking executable with static libraries. One example is building C++ code, where global instances "register" themselves in their constructors (warning: untested code):

// main.cc
#include <map>
#include <string>

typedef void (*handler)(const char *arg);
typedef std::map<std::string, handler> M;
M m;

void register_handler(const char *protocol, handler h) {
   m[protocol] = h;
}
int main(int argc, char *argv[])
{
   for (int i = 1; i < argc-1; i += 2) {
      M::iterator it = m.find(argv[i]);
      if (it != m.end()) it->second(argv[i+1]);
   }
}


// http.cc (part of libfoo.a)
// assumes the handler typedef and register_handler declaration
// come from a shared header
class HttpHandler {
 public:
  HttpHandler() { register_handler("http", &handle_http); }
  static void handle_http(const char *) { /* whatever */ }
};
HttpHandler h; // registers itself with main!

Note that there are no symbols in http.cc that main.cc needs. If you link this as

g++ main.cc -lfoo

you will not get the http handler linked into the main executable, because nothing in main.cc references a symbol from http.cc, so the linker never pulls that object out of the archive. Contrast this with what happens when you link as:

g++ main.cc -Wl,--whole-archive -lfoo -Wl,--no-whole-archive

The same "self-registration" style is also possible in plain C, e.g. with the __attribute__((constructor)) GNU extension.

Employed Russian
A: 

@Employed Russian Thanks for your answer. This (creating globals for their desired side effects) is exactly the reason the original author seems to have used --whole-archive. The problem is that this way of linking against libraries was then carried over to all subsequent in-house libraries by way of imitation. This has given us unnecessarily fat executables.

Jasmeet

Jasmeet