Why does the order in which libraries are linked sometimes cause errors?
I would imagine it is because some of those libraries have dependencies on other libraries, and if they have not been linked yet then you would get linker errors.
In theory there is no requirement to link things in a specific order, but in practice it rarely works out that way. If library A depends on library B, then A generally has to appear before B on the link line, so that the linker already knows which symbols A needs by the time it examines B. Unless you know the locations of symbols very well, you may have to experiment a bit. The nm command will dump out a list of the symbols in a library.
Is this specific to GNU ld/gcc, or is it common to linkers in general?
I have seen this a lot. Some of our modules link in excess of 100 libraries of our own code, plus system and 3rd party libs.
Depending on the linker (HP/Intel/GCC/SUN/SGI/IBM/etc.) you can get unresolved functions, variables and so on. On some platforms you have to list libraries twice.
For the most part we use a structured hierarchy of libraries (core, platform, different layers of abstraction), but for some systems you still have to play with the order in the link command.
Once you hit upon a solution, document it so the next developer does not have to work it out again.
My old lecturer used to say "high cohesion and low coupling". It's still true today.
The GNU ld linker is a so-called smart linker. It will keep track of the functions used by preceding libraries, permanently tossing out those functions that are not used from its lookup tables. The result is that if you link a library too early, then the functions in that library are no longer available to libraries later on the link line.
The typical UNIX linker works from left to right, so put all your dependent libraries on the left, and the ones that satisfy those dependencies on the right of the link line. You may find that some libraries depend on others while at the same time other libraries depend on them. This is where it gets complicated. When it comes to circular references, fix your code!
There are no requirements on the link order of object files or dynamic libraries. Of course, a program that contains undefined behavior or depends on unspecified behavior could be affected by that order. For example, if the order of constructor invocations depends on the link order of object files, and the program accesses objects across translation unit boundaries, then it may access a not-yet-constructed object, because its translation unit's object file was linked after another object file that contains the other, not-yet-constructed object. Those effects, however, show up only in defective programs; a correct program should not depend on such an order.
There are no requirements regarding symbol resolution that could be affected by the link order of dynamic libraries or object files. Runtime effects that depend on that link order, as in the example above, could happen, but should not affect valid programs. Programs should use the tools of the toolchain to make sure they behave correctly. For example, GCC can control when a constructor runs, and can thereby set the priority of constructor calls at object initialization time.
Anyway, static libraries are required to be linked in this order - otherwise, unresolved references will appear with GNU ld:
If any library A depends on symbols defined in library B, then library A should appear first in the list supplied to the linker
That can sometimes cause trouble in build-scripts that can be configured to link either way. The GNU linker has an option which causes it to resolve cyclic references between A and B:
-( archives -)
or
--start-group archives --end-group
The archives should be a list of archive files. They may be either explicit file names, or -l options. The specified archives are searched repeatedly until no new undefined references are created. Normally, an archive is searched only once in the order that it is specified on the command line. If a symbol in that archive is needed to resolve an undefined symbol referred to by an object in an archive that appears later on the command line, the linker would not be able to resolve that reference. By grouping the archives, they will all be searched repeatedly until all possible references are resolved.
Using this option has a significant performance cost. It is best to use it only when there are unavoidable circular references between two or more archives.
Note that "archives" in that description refers to static libraries. I've taken it from the man page of GNU ld (man ld).
One occasion where that could be useful is linking the C runtime library together with the gcc low-level support libraries. Here is what my gcc port passes to the linker:
"-lgcc" "-lc" "-lsyscalls" "-lgcc"
This is because functions in libgcc may refer to functions defined in the C library, while functions in the C library may in turn refer to functions defined in libgcc (it contains such things as floating-point emulation code). Another way to solve that problem could have been to use the start-group and end-group mechanism.
Link order certainly does matter, at least on some platforms. I have seen crashes in applications linked with libraries in the wrong order (where wrong means A linked before B, but B depends on A).
I've written a blogpost about this problem: Link two static libraries in one application