I don't understand how GCC works under Linux. In a source file, when I do a:

#include <math.h>

Does the compiler extract the appropriate binary code and insert it into the compiled executable, or does it insert a reference to an external binary file (à la a Windows DLL)?

I guess a generic version of this question is: Is there an equivalent concept to Windows DLLs under *nix?

+1  A: 

The compiler is allowed to do whatever it pleases, as long as, in effect, it acts as if you'd included the file. (All the compilers I know of, including GCC, simply include a file called math.h.)

And no, it doesn't usually contain the function definitions itself. Those live in libm.so, a "shared object", similar to a Windows DLL. It should be on every system, as it is a companion of libc.so, the C runtime.

Edit: And that's why you have to pass -lm to the linker if you use math functions - it instructs it to link against libm.so.

aib
+1  A: 

There is. The include does a textual include of the header file (which is standard C/C++ behavior). What you're looking for is the linker. The -l argument to gcc/g++ tells the linker which libraries to add in. For math (libm.so), you'd use -lm. The common pattern is:

  • source file: #include <foo.h>
  • gcc/g++ command line: -lfoo
  • shared library: libfoo.so

math.h is a slight variation on this theme.

Harper Shelby
+2  A: 

Is there an equivalent concept to Windows DLLs under *nix?

Yes, they are called "Shared Objects" or .so files. They are dynamically linked into your binary at runtime. On Linux you can use the "ldd" command on your executable to see which shared objects your binary is linked to. You can use ListDLLs from Sysinternals to accomplish the same thing on Windows.

grepsedawk
+23  A: 

Well. When you include math.h the compiler will read the file that contains declarations of the functions and macros that can be used. If you call a function declared in that file (header), the compiler inserts a call instruction at that place in the object file produced from the file you compile (let's call it test.c, and the resulting object file test.o). It also adds an entry to the relocation table of that object file:

Relocation section '.rel.text' at offset 0x308 contains 1 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0000001c  00000902 R_386_PC32        00000000   bar

This would be a relocation entry for a function bar. An entry in the symbol table will be made noting that the function is as yet undefined:

9: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND bar

When you link the test.o object file into a program, you need to link against the math library, called libm.so. The .so extension is similar to the .dll extension on Windows; it means the file is a shared object. The linker will fix up all the places listed in the relocation table of test.o, replacing its entries with the proper address of the bar function. Depending on whether you use the static version of the library (it's called libm.a then) or the shared one, that fix-up happens at link time, or later, at runtime, when you actually start your program. For a shared library, the linker also adds an entry to the table of shared libraries needed by the program (this can be shown with readelf -d ./test):

Dynamic section at offset 0x498 contains 22 entries:
  Tag        Type                         Name/Value
 0x00000001 (NEEDED)                     Shared library: [libm.so.6]
 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
 ... ... ...

Now, if you start your program, the dynamic linker will look up that library and link it to your executable image. On Linux, the program doing this is called ld.so. Static libraries don't have a place in the dynamic section, as they are just linked with the other object files and then forgotten about; they are part of the executable from then on.

In reality it is actually much more complex, and I also don't understand this in detail. That's the rough plan, though.

Johannes Schaub - litb
You didn't mention how to link it...but I think the rest of the detail more than makes up for it.
Harper Shelby
As usual, a very thoughtful, lucid, and comprehensive answer. +1 only because there isn't a +3.
Robert Gamble
Linking or code generation doesn't really have anything to do with #includes though. The trickiest part about #include is how straightforward it is, and how code generation and linking is completely separated from this.
jalf
Liked the answer much. :)
mahesh
Harper, I thought it didn't really fit this question. But now a question has come up that it does fit: http://stackoverflow.com/questions/417876/how-do-i-source-link-external-functions-in-c-or-c#417973 which wanted to know how to link :)
Johannes Schaub - litb
+8  A: 

There are several aspects involved here.

First, header files. The compiler simply includes the content of the file at the location where it was included, nothing more. As far as I know, GCC doesn't even treat standard header files differently (but I might be wrong there).

However, header files might actually not contain the implementation, only its declaration. If the implementation is located somewhere else, you've got to tell the compiler/linker that. You normally do this by passing the appropriate library file to the compiler, or by passing a library name. For example, the following two are equivalent (provided that libcurl.a resides in a directory where it can be found by the linker, and that no shared libcurl.so takes precedence, since -l prefers the shared library when both are present):

gcc codefile.c -lcurl
gcc codefile.c /path/to/libcurl.a

This tells the link editor (“linker”) to link your code file against the implementation in the static library libcurl.a (the compiler gcc actually ignores these arguments because it doesn't know what to do with them, and simply passes them on to the linker). This is called static linking. There's also dynamic linking, which takes place at startup of your program, and which is what happens with .dll files on Windows (whereas static libraries correspond to .lib files on Windows). Dynamic library files on Linux usually have the file extension .so.

The best way to learn more about these files is to familiarize yourself with the GNU linker, ld, and with the rest of the excellent binutils toolset it belongs to, with which you can view and edit library files effortlessly (any binary code files, really).

Konrad Rudolph
gcc does treat standard headers differently, but mostly to ignore some warnings on them; see http://gcc.gnu.org/onlinedocs/cpp/System-Headers.html for the details.
CesarB