views:

3462

answers:

4

I often hear the terms 'statically linked' and 'dynamically linked', usually in reference to code written in C(++|#), but I don't know much about either. What are they, what exactly are they talking about, and what are they linking?

+8  A: 

Statically linked libraries are linked in at build time (strictly speaking, at link time). Dynamically linked libraries are loaded at run time. Static linking bakes the library's bits into your executable. Dynamic linking only bakes in a reference to the library; the bits for the dynamic library exist elsewhere and could be swapped out later.

John D. Cook
+37  A: 

There are (in most cases, discounting interpreted code) two stages in getting from source code (what you write) to executable code (what you run).

The first is compilation which turns source code into object modules.

The second, linking, is what combines object modules together to form an executable.

The distinction is made for, among other things, allowing third party libraries to be included in your executable without you seeing their source code. Examples are libraries for database access, network communications and graphical user interfaces.

In all these cases, you generally only need to concern yourself with the interfaces to those libraries, not the inner workings of them (unless they have bugs, of course).

When you statically link a file into an executable, the contents of that file are included at link time. In other words, the contents of the file are physically inserted into the executable.

When you dynamically link, a pointer (the file name of the file, for example) is included in the executable and the contents are not included at link time. It's only when you run the executable that these dynamically linked files are brought in, and they're only brought into the in-memory copy of the executable, not the one on disk.

It's basically a method of deferred linking. There's an even more deferred method (called late binding on some systems) that won't bring in the dynamically linked file until you actually try to call a function within it.

Statically-linked files are 'locked' to the executable at link time so they never change. A dynamically linked file referenced by an executable can change just by replacing the file on the disk.

This allows updates to functionality without having to re-link the code; the loader re-links every time you run it.

This is both good and bad - on one hand, it allows easier updates and bug fixes, on the other it can lead to programs ceasing to work if the updates are incompatible.


As an example, let's look at the case of a user compiling their main.c file for static and dynamic linking.

Phase     Static                    Dynamic
--------  ----------------------    ------------------------
          +---------+               +---------+
          | main.c  |               | main.c  |
          +---------+               +---------+
Compile........|.........................|...................
          +---------+ +---------+   +---------+ +--------+
          | main.o  | | crtlib  |   | main.o  | | crtimp |
          +---------+ +---------+   +---------+ +--------+
Link...........|..........|..............|...........|.......
               |          |              +-----------+
               |          |              |
          +---------+     |         +---------+ +--------+
          |  main   |-----+         |  main   | | crtdll |
          +---------+               +---------+ +--------+
Load/Run.......|.........................|..........|........
          +---------+               +---------+     |
          | main in |               | main in |-----+
          | memory  |               | memory  |
          +---------+               +---------+

You can see in the static case that the main program and C runtime library are linked together at link time (by the developers). Since the user typically cannot relink the executable, they're stuck with the behaviour.

In the dynamic case, the main program is linked with the C runtime import library (something which declares what's in the dynamic library but doesn't actually define it). This allows the linker to link even though code is missing.

Then, at runtime, the operating system loader does a late linking of the main program with the C runtime DLL (dynamic link library or shared library or other nomenclature).

The owner of the C runtime can drop in a new DLL at any time to provide updates or bug fixes. As stated earlier, this has both advantages and disadvantages.

paxdiablo
Please correct me if I'm wrong, but on Windows, software tends to include its own libraries with the install, even if they're dynamically linked. On many Linux systems with a package manager, many dynamically linked libraries ("shared objects") are actually shared between software.
Paul Fisher
Nice answer, just what I was looking for.
Unkwntech
@PaulF: things like the Windows common controls, DirectX, .NET and so on ship a lot with the applications whereas on Linux, you tend to use apt or yum or something like that to manage dependencies - so you're right in that sense. Win Apps that ship their *own* code as DLLs tend not to share them.
paxdiablo
+1 for sheer excellence.
jprete
It is true that you can update just the dynamic library if ONLY the functionality is updated. But what if new exposed functions or modified signatures are present in the DLL? I think you need to link again.
Sunscreen
There's a special place reserved in the ninth circle of hell for those that update their DLLs and break backward compatibility. Yes, if interfaces disappear or are modified, then the dynamic linking will fall in a heap. That's why it shouldn't be done. By all means add a function2() to your DLL but don't change function() if people are using it. Best way to handle that is to recode function() in such a way that it calls function2(), but _don't_ change the signature of function().
paxdiablo
Thanks for clarifying
Sunscreen
A: 

(I don't know C# but it is interesting to have a static linking concept for a VM language)

Dynamic linking involves knowing how to find required functionality for which your program holds only a reference. Your language runtime or OS searches the filesystem, the network or a compiled-code cache for a piece of code matching the reference, and then takes several steps, such as relocation, to integrate it into your program's image in memory. All of this is done at runtime. It can be done either manually or by the compiler. You gain the ability to update, at the risk of messing things up (namely, DLL hell).

Static linking is done at compile time: you tell the compiler where all the functional parts are and instruct it to integrate them. There is no searching, no ambiguity, and no ability to update without a recompile. All your dependencies are physically one with your program image.

artificialidiot
+32  A: 

I think a good answer to this question ought to explain what linking is.

When you compile some C code (for instance), it is translated to machine language. Just a sequence of bytes which when run causes the processor to add, subtract, compare, "goto", read memory, write memory, that sort of thing. This stuff is stored in object (.o) files.

Now, a long time ago, computer scientists invented this "subroutine" thing. Execute-this-chunk-of-code-and-return-here. It wasn't too long before they realised that the most useful subroutines could be stored in a special place and used by any program that needed them.

Now in the early days programmers would have to punch in the memory address that these subroutines were located at. Something like CALL 0x5A62. This was tedious and problematic should those memory addresses ever need to be changed.

So, the process was automated. You write a program that calls printf(), and the compiler doesn't know the memory address of printf. So the compiler just writes CALL 0x0000, and adds a note to the object file saying "must replace this 0x0000 with the memory location of printf".

Static linkage means that the linker program (the GNU one is called ld) adds printf's machine code directly to your executable file, and changes the 0x0000 to the address of printf. This happens when your executable is created.

Dynamic linkage means that the above step doesn't happen. The executable file still has a note that says "must replace 0x0000 with the memory location of printf". The operating system's loader needs to find the printf code, load it into memory, and correct the CALL address, each time the program is run.

It's common for programs to call some functions which will be statically linked and other functions which are dynamically linked (whether standard library functions like printf end up statically or dynamically linked depends on the platform and build settings). The static ones "become part" of the executable and the dynamic ones "join in" when the executable is run.

There are advantages and disadvantages to both methods, and there are differences between operating systems. But since you didn't ask, I'll end this here.

Artelius
I liked the explanation of linking
Nathan Fellman
I did too, however I only get to choose 1 answer.
Unkwntech
I would have chosen the other one too :P but thanks
Artelius
incredible answer, Artelius. :)
mahesh
Artelius, I am looking for more in-depth material on how these crazy low-level things work. Please reply with the books we should read to get in-depth knowledge of the above. Thank you.
mahesh
Sorry, I can't suggest any books. You should learn assembly language first. Then Wikipedia can give a decent overview of such topics. You may want to look at the GNU `ld` documentation.
Artelius