When I declare a function in a header file, and put the definition of that function in some other file, how does the compiler/linker find the definition? Does it systematically search every file in its path for it, or is there a more elegant solution? This has been bugging me for the past few days, and I've been unable to find an explanation for it.

+11  A: 

The compiler doesn't do this, the linker does.

While the compiler works on one source file at a time, when the linker is invoked it is passed the names of all of the object files generated by the compiler, as well as any libraries that the user wishes to have linked in. Therefore, the linker has complete knowledge of the set of files that could potentially contain the definition, and it needs only to look in the symbol tables of those object files. It doesn't need to do any searching beyond that.

For example, say you have foo.h and foo.c declaring and defining function foo(), and bar.h and bar.c declaring and defining bar(). Say bar calls foo, so bar.c includes foo.h. There are three steps to this build:

gcc -c foo.c
gcc -c bar.c
gcc foo.o bar.o -o program

The first line compiles foo.c, producing foo.o. The second compiles bar.c, producing bar.o. At this point, in the object file bar.o, foo is an unresolved external symbol. The third line invokes the linker, which links foo.o and bar.o together into an executable called "program". When the linker processes bar.o, it sees the unresolved external symbol foo, so it looks in the symbol tables of all of the other object files being linked (in this case just foo.o), finds foo in foo.o, and completes the link.
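To make the foo/bar example concrete, here is a minimal sketch of what the files might contain. The signatures and function bodies are invented for illustration (the answer only names the functions), bar.h is omitted, and everything is shown as a single translation unit so it compiles on its own. Splitting it at the marked comments gives the separate files used in the three gcc commands.

```c
/* --- foo.h: declaration only; tells the compiler foo's type --- */
int foo(int x);

/* --- bar.c: would #include "foo.h"; calls foo without seeing its body --- */
int bar(int x)
{
    /* When bar.c is compiled alone, this call becomes a reference to the
       external symbol "foo" in bar.o; the linker fills it in later. */
    return foo(x) + 1;
}

/* --- foo.c: the definition that the linker matches to that reference --- */
int foo(int x)
{
    return 2 * x;   /* hypothetical body, chosen only for the example */
}
```

With the files split this way, running a symbol-listing tool such as nm on bar.o should show foo as undefined and bar as defined, which is the information the linker works from.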

With libraries this is a bit more complicated, and the order in which they appear on the command line can matter depending on your linker, but it's generally the same principle.

Tyler McHenry
Good answer - it's worth adding that at least one library - the standard C library - is *implicitly* included in the link line. So in this case, it would look not just in `foo.o` but also in the standard C library for unresolved symbols.
caf
*And*, when linking with g++ you'll get -lstdc++ by default as well.
Bklyn
+9  A: 

When you compile a .cpp file, the compiler outputs two tables in the .obj file: a list of symbols that it expects to be defined externally, as well as a list of symbols that are defined in that particular module.

The linker takes all of the .obj files that were output by the compiler and then (as the name suggests) links them all together. So for each module, it looks at the list of symbols that are marked "defined externally" and looks through all of the other modules it was given for those symbols.

So it only ever "searches" the modules that you told it to search in.

If it can't find the symbol in any of the other modules, that's when you get the "undefined reference" error.
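The two-table idea can be sketched as a toy model in C. This is not how any real linker is implemented; the struct layout, the MAX_SYMS limit, and the function names are all hypothetical, and real object files carry far more information. It only shows that resolution is a lookup over the modules handed to the linker, not a search of the file system.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 4   /* hypothetical limit, just to keep the model small */

/* Simplified stand-in for an object file: the two symbol lists. */
struct object_file {
    const char *name;
    const char *defined[MAX_SYMS];    /* symbols defined here; NULL marks the end */
    const char *undefined[MAX_SYMS];  /* symbols expected from elsewhere; NULL marks the end */
};

/* Return the name of the module that defines `symbol`, or NULL if none does.
   Only the n modules passed in are examined. */
const char *find_definition(const struct object_file *objs, size_t n,
                            const char *symbol)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < MAX_SYMS && objs[i].defined[j] != NULL; j++)
            if (strcmp(objs[i].defined[j], symbol) == 0)
                return objs[i].name;
    return NULL;
}

/* Count external references that none of the supplied modules satisfies.
   A real linker would report each of these as an undefined reference. */
size_t count_unresolved(const struct object_file *objs, size_t n)
{
    size_t missing = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < MAX_SYMS && objs[i].undefined[j] != NULL; j++)
            if (find_definition(objs, n, objs[i].undefined[j]) == NULL)
                missing++;
    return missing;
}
```

Given one module that defines foo and another that defines bar and lists foo as undefined, count_unresolved returns 0 when both are supplied and 1 when only the second is supplied, mirroring the "undefined reference" case.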

Dean Harding
A: 

Assume you have a foo.cpp with an #include "foo.h" and maybe other includes. Headers can of course have their own #include directives.

The preprocessor starts with foo.cpp, processes the #include directives, and reads in the header contents. The result is the text of the header files and foo.cpp "flattened" into one. The compiler then works from that text. If a variable, function, etc. that should have been declared somewhere in a header isn't found, the compiler reports an error.

The basic point is that the compiler has to see every declaration it needs in the text produced from the .cpp file and its headers.

seand