I've read that the gcc compiler can perform certain optimizations when compiling an application that references a static library. For instance, it will "pull" in only the code from the static library that the application depends upon. This helps keep the size of the application's executable to a minimum if portions of the static library are not being used by the app.

1) Is this true?

2) How does GCC know what code from the static library the application is actually using? Does it only look at the header files that are included (directly and indirectly) in the application and then pull code accordingly? Or does it actually look at what methods from the static library are being called?

+2  A: 

A static library is just a bag of object files. The linker (ld) will keep track of which object files are used (i.e. contain a function referenced from somewhere), and will not include unreferenced object files in the final executable image.

JesperE
+1  A: 

gcc does nothing of the sort. Everything you describe is linking, which is handled by ld.

ld examines the symbol tables of the object files in order to determine which symbols need to be linked, and then pulls the relevant object files from the libraries and links them into the executable.

Ignacio Vazquez-Abrams
+1  A: 

Answers
1) Yes, only the code referenced will be pulled in. Besides the smaller size there is also a gain in link speed, since the static library contains an index table of all the symbols exported by the library. It is quicker to do lookups in this table than to look through the object files one by one.
Alternatively, if you want to pull in all the symbols in the static library regardless of whether they are referenced, you can pass the --whole-archive switch to ld.
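
To make the index table idea concrete, here is a rough toy model in Python. It is only a sketch and not how ar or ld actually store things: the archive is a plain dict, and build_index and find_by_scan are hypothetical helper names made up for this illustration.

```python
# Toy model: an archive as a mapping of member name -> symbols it defines.
# The member and symbol names are hypothetical.
archive = {
    "b.o": {"bar", "moreBar"},
    "c.o": {"baz", "moreBaz"},
}


def build_index(members):
    """Build a symbol -> member table, similar in spirit to an archive's symbol index."""
    index = {}
    for member, symbols in members.items():
        for sym in symbols:
            index.setdefault(sym, member)  # the first member defining a symbol wins
    return index


def find_by_scan(members, symbol):
    """The slow alternative: open each member in turn and look for the symbol."""
    for member, symbols in members.items():
        if symbol in symbols:
            return member
    return None


index = build_index(archive)
print(index["baz"])                  # c.o, found with a single table lookup
print(find_by_scan(archive, "baz"))  # c.o, found only after checking b.o first
```

Both lookups give the same answer; the index just avoids walking through every member for every symbol.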

2) It would be more correct to ask this question in the context of ld (the GNU linker), since that is what actually pulls in the references. GCC just invokes the linker after it's done compiling (unless you do gcc -c, which causes it to stop after compilation). So, after compilation is done, ld is invoked with an ordered list of object (.o) files and libraries.

ld processes the .o files one by one, and for each the linker
a) Notes down the external symbols needed by this file that cannot be resolved yet, and adds these to a (say) unresolved table.
b) Looks at the symbols (functions, global variables) exported by this file and resolves any previous references that it can.
This is a very simplified overview of the linking process.

Now when the linker comes to the static library, it essentially does the same thing, this time using the static library to resolve symbols. However there is one difference: the linker pulls in only those object files from the library that define a currently unresolved symbol, along with whatever object files those in turn depend on. So assume we have a.o and libstatic.a, which in turn contains b.o and c.o.
b.o defines bar() and moreBar();
c.o defines baz() and moreBaz();
a.o defines foo();
where foo calls bar, which calls baz. Now when you do gcc -o app a.o libstatic.a, after processing a.o the linker knows that it needs to resolve bar. This gets resolved from b.o in the static library; however, while resolving bar the linker notices that bar needs baz. This again gets resolved from libstatic.a, by pulling in c.o. moreBar() and moreBaz() have no references, but since ld works on whole object files they still come along with b.o and c.o; a member of libstatic.a that defined nothing the app needs would be left out entirely. (Discarding unreferenced functions inside an object file needs something like -ffunction-sections together with the linker's --gc-sections.)
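
The walk-through above can be sketched as a small Python simulation. This is a simplified model under my own assumptions, not ld's real algorithm: each object file is reduced to a pair (symbols it defines, symbols it references), the link function and its whole_archive parameter are hypothetical names, and d.o with qux is an extra unused member added purely for illustration.

```python
# Each object file is modelled as (defined symbols, referenced symbols).
objects = {"a.o": ({"foo"}, {"bar"})}
libstatic = {
    "b.o": ({"bar", "moreBar"}, {"baz"}),
    "c.o": ({"baz", "moreBaz"}, set()),
    "d.o": ({"qux"}, set()),  # hypothetical member that nothing references
}


def link(object_files, archive, whole_archive=False):
    """Return (files pulled in, symbols defined, symbols still unresolved)."""
    defined, undefined, pulled = set(), set(), []

    def add(name, defs, refs):
        pulled.append(name)
        defined.update(defs)
        undefined.update(refs)
        undefined.difference_update(defined)  # drop whatever is now resolved

    # Plain .o files on the command line are always linked in.
    for name, (defs, refs) in object_files.items():
        add(name, defs, refs)

    # Rescan the archive until a full pass pulls in nothing new.
    remaining = dict(archive)
    progress = True
    while progress:
        progress = False
        for name, (defs, refs) in list(remaining.items()):
            if whole_archive or (defs & undefined):
                add(name, defs, refs)
                del remaining[name]
                progress = True
    return pulled, defined, undefined


pulled, defined, undefined = link(objects, libstatic)
print(pulled)                # ['a.o', 'b.o', 'c.o']
print("moreBar" in defined)  # True: it rides along with b.o
print(undefined)             # set()
```

Calling link(objects, libstatic, whole_archive=True) also pulls in d.o, which is the toy equivalent of passing --whole-archive.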

Jasmeet