When writing C/C++ code, the debug option must be enabled on the compiler/linker in order to debug the binary executable. In the case of GCC, the option is -g. When the debug option is enabled, how does this affect the binary executable? What additional data is stored in the file that allows the debugger to function as it does?
-g tells the compiler to store symbol table information in the executable. Among other things, this includes:
- symbol names
- type info for symbols
- files and line numbers where the symbols came from
Debuggers use this information to output meaningful names for symbols and to associate instructions with particular lines in the source.
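As a rough illustration of the three kinds of information listed above, here is a small C sketch. The names `struct sample`, `sample_mean`, and the file name `demo.c` are made up for this example, and the gdb commands in the comments are what one would typically type, not captured output.

```c
#include <stddef.h>

/* A user-defined type: with -g, its layout is recorded so a debugger
   can answer `ptype struct sample` (hypothetical type for illustration). */
struct sample {
    int    count;
    double total;
};

/* With -g, the function name and the source line of each statement are
   recorded, so `break sample_mean` and `step` can work on source lines. */
double sample_mean(const struct sample *s)
{
    if (s == NULL || s->count == 0) {
        return 0.0;
    }
    double mean = s->total / s->count;  /* `print mean` needs name and type */
    return mean;
}

/*
 * Assumed workflow (demo.c is a hypothetical file that also has a main
 * calling sample_mean):
 *   gcc -g -O0 demo.c -o demo
 *   gdb ./demo
 *   (gdb) break sample_mean
 *   (gdb) run
 *   (gdb) ptype struct sample
 *   (gdb) print *s
 */
```

Without -g, the same session would typically still stop at the function, but gdb would have no type or line information to show for `s` and `mean`.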
For some compilers, supplying -g will disable certain optimizations. For example, icc sets the default optimization level to -O0 with -g unless you explicitly indicate -O[123]. Also, even if you do supply -O[123], optimizations that prevent stack tracing will still be disabled (e.g. stripping frame pointers from stack frames). This has only a minor effect on performance.
With some compilers, -g will disable optimizations that can confuse where symbols came from (instruction reordering, loop unrolling, inlining, etc.). If you want to debug with optimization, you can use -g3 with gcc to get around some of this. Extra debug info will be included, such as macro definitions and their expansions, alongside information about functions that may have been inlined. This can allow debuggers and performance tools to map optimized code to the original source, but it's best effort. Some optimizations really mangle the code.
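To make the macro part concrete, here is a minimal sketch. `BUFFER_WORDS`, `SQUARE`, and `squared_capacity` are hypothetical names, and the session in the comment assumes a gdb that supports its `macro` commands and a binary built with -g3.

```c
/* Hypothetical macros for illustration. */
#define BUFFER_WORDS 16
#define SQUARE(x) ((x) * (x))

int squared_capacity(void)
{
    /* After preprocessing, the compiler only sees ((16) * (16)) here,
       so the names SQUARE and BUFFER_WORDS are gone unless the debug
       info records the macro definitions. */
    return SQUARE(BUFFER_WORDS);
}

/*
 * Assumed session (demo.c is a hypothetical file name):
 *   gcc -g3 -O0 demo.c -o demo
 *   gdb ./demo
 *   (gdb) break squared_capacity
 *   (gdb) run
 *   (gdb) info macro SQUARE
 *   (gdb) macro expand SQUARE(BUFFER_WORDS)
 */
```

With plain -g, gdb would normally not know about `SQUARE` at all, since macros disappear during preprocessing.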
For more info, take a look at DWARF, the debugging format originally designed to go along with ELF (the binary format for Linux and other operating systems).
A symbol table is added to the executable which maps function/variable names to data locations, so that debuggers can report back meaningful information, rather than just pointers. This doesn't affect the speed of your program, and you can remove the symbol table with the 'strip' command.
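The name-to-location mapping can be pictured with a toy lookup table. This is only an analogy written in C source: a real symbol table is emitted by the compiler and linker into the object file, and `symbol_entry`, `symbol_table`, and `symbol_name_for` are invented names.

```c
#include <stddef.h>
#include <string.h>

/* Two globals standing in for a program's data (hypothetical). */
int request_count = 0;
double average_latency = 0.0;

/* One row of the toy table: a name paired with a data location. */
struct symbol_entry {
    const char *name;
    const void *address;
};

static const struct symbol_entry symbol_table[] = {
    { "request_count",   &request_count },
    { "average_latency", &average_latency },
};

/* Turn a raw pointer back into a readable name, which is roughly the
   service a symbol table provides to a debugger. Returns NULL when the
   address is not in the table. */
const char *symbol_name_for(const void *address)
{
    size_t count = sizeof symbol_table / sizeof symbol_table[0];
    for (size_t i = 0; i < count; i++) {
        if (symbol_table[i].address == address) {
            return symbol_table[i].name;
        }
    }
    return NULL;
}
```

Stripping the executable is comparable to deleting `symbol_table`: the program's data and behavior are unchanged, but nothing is left to translate addresses into names.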
There is some overlap with this question which covers the issue from the other side.
Just as a matter of interest, you can crack open a hex editor and take a look at an executable produced with -g and one without. You can see the symbols and things that are added. It may change the assembly (-S) too, but I'm not sure.
-g adds debugging information to the executable, such as the names of variables, the names of functions, and line numbers. This allows a debugger, such as gdb, to step through code line by line, set breakpoints, and inspect the values of variables. Because of this additional information, using -g increases the size of the executable.
Also, gcc allows -g to be used together with the -O flags, which turn on optimization. Debugging an optimized executable can be very tricky, because variables may be optimized away, or instructions may be executed in a different order. Generally, it is a good idea to turn off optimization when using -g, even though it results in much slower code.
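A small function is enough to see this effect. The sketch below uses a hypothetical `sum_to`; what the optimizer actually does with it depends on the compiler and version, so the comments describe what may happen rather than what will.

```c
/* Hypothetical function for illustration. */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i;   /* at -O0, stepping here shows i and total changing */
    }
    return total;
}

/*
 * Assumed comparison (demo.c is a hypothetical file with a main that
 * calls sum_to):
 *   gcc -g -O0 demo.c -o demo_debug   # locals are kept and can be printed
 *   gcc -g -O2 demo.c -o demo_opt     # the loop may be restructured or
 *                                     # removed; `print i` in gdb may
 *                                     # report <optimized out>
 * GCC also documents -Og, which is meant to optimize without hurting
 * the debugging experience as much.
 */
```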
In addition to the debugging and symbol information (Google DWARF, a developer joke on ELF):
By default most compiler optimizations are turned off when debugging is enabled.
So the code is a plain translation of the source into machine code, rather than the result of the many highly specialized transformations that are applied to release binaries.
But the most important difference (in my opinion):
Memory in Debug builds is usually initialized to some compiler specific values to facilitate debugging. In release builds memory is not initialized unless explicitly done so by the application code.
Check your compiler documentation for more information, but an example for DevStudio is:
- 0xCDCDCDCD Allocated in heap, but not initialized
- 0xDDDDDDDD Released heap memory.
- 0xFDFDFDFD "NoMansLand" fences automatically placed at boundary of heap memory. Should never be overwritten. If you do overwrite one, you're probably walking off the end of an array.
- 0xCCCCCCCC Allocated on stack, but not initialized
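To show how such fill patterns help, here is a minimal sketch of a debug allocator that imitates the heap values above. It is not the DevStudio runtime's code: `debug_alloc`, `debug_fences_intact`, `debug_free`, and the 16-byte header and fence widths are assumptions chosen for this example (the widths only keep the returned pointer aligned on typical platforms).

```c
#include <stdlib.h>
#include <string.h>

#define FILL_UNINITIALIZED 0xCD  /* allocated, not yet written */
#define FILL_FREED         0xDD  /* released heap memory */
#define FILL_FENCE         0xFD  /* guard bytes around the block */
#define HEADER_BYTES       16    /* holds the size; assumed wide enough */
#define FENCE_BYTES        16    /* arbitrary width for this sketch */

/* Layout: [header: size][front fence][user bytes][back fence] */
void *debug_alloc(size_t size)
{
    unsigned char *raw = malloc(HEADER_BYTES + 2 * FENCE_BYTES + size);
    if (raw == NULL) {
        return NULL;
    }
    memcpy(raw, &size, sizeof size);          /* remember the user size */
    unsigned char *front = raw + HEADER_BYTES;
    unsigned char *user = front + FENCE_BYTES;
    memset(front, FILL_FENCE, FENCE_BYTES);
    memset(user, FILL_UNINITIALIZED, size);
    memset(user + size, FILL_FENCE, FENCE_BYTES);
    return user;
}

/* Read back the size stored in the header of a debug_alloc block. */
static size_t debug_size(const unsigned char *user)
{
    size_t size;
    memcpy(&size, user - FENCE_BYTES - HEADER_BYTES, sizeof size);
    return size;
}

/* Returns 1 if both fences still hold the fence pattern, 0 otherwise. */
int debug_fences_intact(const void *ptr)
{
    const unsigned char *user = ptr;
    const unsigned char *front = user - FENCE_BYTES;
    const unsigned char *back = user + debug_size(user);
    for (size_t i = 0; i < FENCE_BYTES; i++) {
        if (front[i] != FILL_FENCE || back[i] != FILL_FENCE) {
            return 0;
        }
    }
    return 1;
}

/* Overwrite the user bytes with the "freed" pattern, then release. */
void debug_free(void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    unsigned char *user = ptr;
    memset(user, FILL_FREED, debug_size(user));
    free(user - FENCE_BYTES - HEADER_BYTES);
}
```

For example, after `unsigned char *p = debug_alloc(8);`, writing to `p[8]` lands in the back fence, and `debug_fences_intact(p)` then returns 0. This sketch releases the memory immediately after filling it, so the 0xDD pattern is only useful here if the allocator is extended to hold freed blocks for a while.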