When my C++ app crashes, I would like to generate a stack trace.

I already asked this, but I guess I needed to clarify my needs.

My app is run by many different users, and it runs on Linux, Windows and Macintosh (all versions are compiled using gcc).

I would like my program to generate a stack trace when it crashes; the next time the user runs it, it will ask them whether it is OK to send the stack trace to me so I can track down the problem. I can handle sending the info to me, but I don't know how to generate the trace string. Any ideas?

+2  A: 
ulimit -c unlimited

is a shell command that raises the core file size limit, which allows a core dump to be written after your application crashes. In this case the allowed size is unlimited. Look for a file called core in the directory the program was run from. Make sure you compiled your code with debugging information enabled!

regards

mana
The user is not asking for a core dump. He's asking for a stack trace. See http://www.delorie.com/gnu/docs/glibc/libc_665.html
tgamblin
a core dump will contain the call stack at the moment of the crash, won't it?
Mo
You're assuming he's on Unix, and using Bash.
Paul Tomblin
+1  A: 

Usually when your application crashes, and you have the core files enabled, you'll get a core dump, which you can analyze with gdb. To do this, you have to enable core dumps:

$ ulimit -c unlimited

Also, if you run your app using gdb, the app will halt if a signal is received. You can just get the stack trace with the gdb command bt.

terminus
Missing the point of the question.
Jonathan Leffler
+1  A: 
  • Compile your code using the -g flag to include debug symbols in the binary.
  • Set up your system so that core files are produced when applications crash (e.g. ulimit -c unlimited).
  • When an application crashes, you can load the core file in a debugger (such as gdb, by running, for example, gdb ./prog core) to get a backtrace (gdb command: bt).

Note that C++ symbol names are sometimes pretty garbled and the backtrace will probably be somewhat incomprehensible.

More helpful backtraces will probably need evil trickery (one solution I've heard of requires that you add a special macro to the beginning of all methods that you write).

Jan Krüger
Missing the point of the question.
Jonathan Leffler
+4  A: 

Some versions of libc contain functions that deal with stack traces; you might be able to use them:

http://www.gnu.org/software/libc/manual/html_node/Backtraces.html

I remember using libunwind a long time ago to get stack traces, but it may not be supported on your platform.

Stephen Deken
+6  A: 

You did not specify your operating system, so this is difficult to answer. If you are using a system based on gnu libc, you might be able to use the libc function backtrace().

GCC also has two builtins that can assist you, though they may not be fully implemented on your architecture: __builtin_frame_address and __builtin_return_address. Both take an immediate integer level (by immediate, I mean it can't be a variable). If __builtin_frame_address for a given level is non-zero, it should be safe to grab the return address of the same level.
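
As a rough illustration of the two builtins, here is a minimal sketch. The helper names caller_return_address and current_frame are made up for this example, and only level 0 is used because the GCC documentation warns that non-zero levels can have unpredictable effects, including crashing the program.

```cpp
#include <cstdio>

// Hypothetical helper: returns the address this function will return to,
// i.e. a location inside whichever function called it.
// noinline keeps the helper as a real stack frame of its own.
__attribute__((noinline)) void* caller_return_address() {
    // The level argument must be a compile-time constant; 0 means
    // "the return address of the current function".
    return __builtin_extract_return_addr(__builtin_return_address(0));
}

// Hypothetical helper: returns the frame address of this function.
__attribute__((noinline)) void* current_frame() {
    return __builtin_frame_address(0);
}

// Example use: print where we were called from.
void print_caller() {
    std::printf("return address %p, frame %p\n",
                caller_return_address(), current_frame());
}
```

Calling print_caller() from anywhere prints one code address and one stack address; turning the code address into a function name still needs something like backtrace_symbols() or addr2line.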

Brian Mitchell
+2  A: 

Look at:

man 3 backtrace

And:

#include <execinfo.h>
int backtrace(void **buffer, int size);

These are GNU extensions.

Stéphane
There may be additional examples to help out on this page I created a while back: http://charette.no-ip.com:81/programming/2010-01-25_Backtrace/
Stéphane
+2  A: 

It's important to note that once you generate a core file you'll need to use the gdb tool to look at it. For gdb to make sense of your core file, you must tell gcc to instrument the binary with debugging symbols: to do this, you compile with the -g flag:

$ g++ -g prog.cpp -o prog

Then, you can either set "ulimit -c unlimited" to let it dump a core, or just run your program inside gdb. I like the second approach more:

$ gdb ./prog
... gdb startup output ...
(gdb) run
... program runs and crashes ...
(gdb) where
... gdb outputs your stack trace ...

I hope this helps.

Benson
You can also call `gdb` right from your crashing program. Set up a handler for SIGSEGV, SIGILL, SIGBUS and SIGFPE that calls gdb. Details: http://stackoverflow.com/questions/3151779/how-its-better-to-invoke-gdb-from-program-to-print-its-stacktrace The advantage is that you get a nicely annotated backtrace like the one from `bt full`, and you can also get stack traces of all threads.
Vi
+1  A: 

I would use the code that generates a stack trace for leaked memory in Visual Leak Detector. This only works on Win32, though.

Jim Buck
+2  A: 

I can help with the Linux version: the functions backtrace, backtrace_symbols and backtrace_symbols_fd can be used. See the corresponding manual pages.
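
A minimal sketch of how these functions fit together outside of a signal handler, assuming glibc's execinfo.h is available. The function name capture_trace and the 64-frame limit are choices made for this example only.

```cpp
#include <execinfo.h>

#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: returns the current call stack as one string per frame.
std::vector<std::string> capture_trace() {
    constexpr int kMaxFrames = 64;  // arbitrary cap chosen for this sketch
    void* frames[kMaxFrames];

    // backtrace() fills the array with return addresses and returns the count.
    int count = backtrace(frames, kMaxFrames);

    std::vector<std::string> lines;
    // backtrace_symbols() mallocs a single block holding all the strings.
    char** symbols = backtrace_symbols(frames, count);
    if (symbols == nullptr) {
        return lines;  // out of memory: return an empty trace
    }
    for (int i = 0; i < count; ++i) {
        lines.emplace_back(symbols[i]);
    }
    std::free(symbols);  // free the block once, not each string
    return lines;
}
```

Because backtrace_symbols() allocates memory, this form suits logging from ordinary code; inside a crash handler backtrace_symbols_fd() is the usual choice since it writes straight to a file descriptor.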

terminus
A: 

If your program crashes, it's the operating system itself that generates crash dump information. If you're using a *nix OS, you simply need to not prevent it from doing so (check out the ulimit command's 'coredump' options).

nsayer
A: 

*nix: you can intercept SIGSEGV (usually this signal is raised just before the crash) and save the info to a file (in addition to the core file, which you can debug with gdb, for example).
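
One way this could look, as a sketch only: the handler writes the backtrace to a file with backtrace_symbols_fd(), and the next launch checks whether that file exists. The names dump_trace_to_file, install_crash_reporter, has_pending_report and the default file name are all invented for this example, and glibc's execinfo.h is assumed.

```cpp
#include <execinfo.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

// Hypothetical report location; the pointer must stay valid for the
// lifetime of the process because the handler reads it.
static const char* g_report_path = "crash_report.txt";

// Writes the current backtrace to `path`; returns true if frames were written.
bool dump_trace_to_file(const char* path) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        return false;
    }
    void* frames[64];
    int count = backtrace(frames, 64);
    backtrace_symbols_fd(frames, count, fd);  // writes directly, no malloc
    close(fd);
    return count > 0;
}

// Signal handler: save the trace, then terminate without running atexit code.
extern "C" void on_crash(int sig) {
    dump_trace_to_file(g_report_path);
    _exit(128 + sig);
}

// Installs the handler for SIGSEGV; returns true on success.
bool install_crash_reporter(const char* path) {
    g_report_path = path;
    struct sigaction sa{};
    sa.sa_handler = on_crash;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}

// On the next start: is there a report the user could be asked to send?
bool has_pending_report(const char* path) {
    return access(path, R_OK) == 0;
}
```

At startup the program would call has_pending_report(), ask the user, then delete the file and call install_crash_reporter() again.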

win: Check this from msdn.

You can also look at Google Chrome's code to see how it handles crashes. It has a nice exception handling mechanism.

Iulian Şerbănoiu
A: 

On Linux/unix/MacOSX use core files (you can enable them with ulimit or compatible system call). On Windows use Microsoft error reporting (you can become a partner and get access to your application crash data).
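
The "compatible system call" is presumably setrlimit() with RLIMIT_CORE. A minimal sketch that raises the soft limit to the hard limit from inside the program (enable_core_dumps is a name chosen for this example):

```cpp
#include <sys/resource.h>

// Hypothetical helper: raise the soft core-file size limit to the hard limit.
// Returns true if both system calls succeeded.
bool enable_core_dumps() {
    struct rlimit lim{};
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return false;
    }
    // An unprivileged process may raise its soft limit up to the hard limit.
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

If the hard limit is itself 0, this succeeds but still produces no core file, and on Linux the kernel's core_pattern setting decides where the dump ends up.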

Kasprzol
+2  A: 

ulimit -c sets the core file size limit on unix. By default, the core file size limit is 0. You can see your ulimit values with ulimit -a.

Also, if you run your program from within gdb, it will halt your program on "segmentation violations" (SIGSEGV, generally raised when you access a piece of memory that you hadn't allocated), or you can set breakpoints.

ddd and nemiver are front-ends for gdb which make working with it much easier for the novice.

Core dumps are infinitely more useful than stack traces because you can load the core dump in the debugger and see the state of the whole program and its data at the point of the crash.
Adam Hawes
The backtrace facility that others have suggested is probably better than nothing, but it is very basic -- it doesn't even give line numbers. Using core dumps, on the other hand, lets you retroactively view the entire state of your application at the time it crashed (including a detailed stack trace). There *might* be practical issues with trying to use this for field debugging, but it is definitely a more powerful tool for analyzing crashes and asserts during development (at least on Linux).
nobar
+1  A: 

win: How about StackWalk64 http://msdn.microsoft.com/en-us/library/ms680650.aspx

Roskoto
A: 

I forgot about the GNOME tech of "apport", but I don't know much about using it. It is used to generate stack traces and other diagnostics for processing, and it can automatically file bugs. It's certainly worth looking into.

+42  A: 

For Linux and I believe Mac OS X, if you're using gcc, or any compiler that uses glibc, you can use the backtrace() functions in execinfo.h to print a stacktrace and exit gracefully when you get a segmentation fault. Documentation can be found in the libc manual.

Here's an example program that installs a SIGSEGV handler and prints a stacktrace to stderr when it segfaults. The baz() function here causes the segfault that triggers the handler:

#include <stdio.h>
#include <execinfo.h>
#include <signal.h>
#include <stdlib.h>


void handler(int sig) {
  void *array[10];
  size_t size;

  // get void*'s for all entries on the stack
  size = backtrace(array, 10);

  // print out all the frames to stderr
  fprintf(stderr, "Error: signal %d:\n", sig);
  backtrace_symbols_fd(array, size, 2);
  exit(1);
}

void baz() {
  int *foo = (int*)-1;   // make a bad pointer
  printf("%d\n", *foo);  // causes segfault
}

void bar() { baz(); }
void foo() { bar(); }


int main(int argc, char **argv) {
  signal(SIGSEGV, handler);   // install our handler
  foo(); // this will call foo, bar, and baz.  baz segfaults.
}

Compiling with -g -rdynamic gets you symbol info in your output, which glibc can use to make a nice stacktrace:

$ gcc -g -rdynamic ./test.c -o test

Executing this gets you this output:

$ ./test
Error: signal 11:
./test(handler+0x19)[0x400911]
/lib64/tls/libc.so.6[0x3a9b92e380]
./test(baz+0x14)[0x400962]
./test(bar+0xe)[0x400983]
./test(foo+0xe)[0x400993]
./test(main+0x28)[0x4009bd]
/lib64/tls/libc.so.6(__libc_start_main+0xdb)[0x3a9b91c4bb]
./test[0x40086a]

This shows the load module, offset, and function that each frame in the stack came from. Here you can see the signal handler on top of the stack, and the libc functions before main in addition to main, foo, bar, and baz.

tgamblin
There's also /lib/libSegFault.so which you can use with LD_PRELOAD.
CesarB
It looks like the first two entries in your backtrace output contain a return address inside the signal handler and probably one inside `sigaction()` in libc. While your backtrace appears to be correct, I have sometimes found that additional steps are necessary to ensure the actual location of the fault appears in the backtrace as it can be overwritten with `sigaction()` by the kernel.
jschmier
+1  A: 

See the Stack Trace facility in ACE (ADAPTIVE Communication Environment). It's already written to cover all major platforms (and more). The library is BSD-style licensed so you can even copy/paste the code if you don't want to use ACE.

Adam Mitz
+5  A: 

Might be worth looking at Google Breakpad, a cross-platform crash dump generator and tools to process the dumps.

Simon Steele
+3  A: 

I've been looking at this problem for a while.

Buried deep in the Google Performance Tools README

http://code.google.com/p/google-perftools/source/browse/trunk/README

is a discussion of libunwind:

http://www.nongnu.org/libunwind/

I would love to hear opinions of this library.

The problem with -rdynamic is that it can noticeably increase the size of the binary in some cases.

Gregory
On x86/64, I have not seen -rdynamic increase binary size much. Adding -g makes for a much bigger increase.
Dan
+12  A: 

Linux

While the use of the backtrace() functions in execinfo.h to print a stacktrace and exit gracefully when you get a segmentation fault has already been suggested, I see no mention of the intricacies necessary to ensure the resulting backtrace points to the actual location of the fault (at least for some architectures - x86 & ARM).

The first two entries in the stack frame chain when you get into the signal handler contain a return address inside the signal handler and one inside sigaction() in libc. The stack frame of the last function called before the signal (which is the location of the fault) is lost.

Code

#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#ifndef __USE_GNU
#define __USE_GNU
#endif

#include <execinfo.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ucontext.h>
#include <unistd.h>

/* This structure mirrors the one found in /usr/include/asm/ucontext.h */
typedef struct _sig_ucontext {
 unsigned long     uc_flags;
 struct ucontext   *uc_link;
 stack_t           uc_stack;
 struct sigcontext uc_mcontext;
 sigset_t          uc_sigmask;
} sig_ucontext_t;

void crit_err_hdlr(int sig_num, siginfo_t * info, void * ucontext)
{
 void *             array[50];
 void *             caller_address;
 char **            messages;
 int                size, i;
 sig_ucontext_t *   uc;

 uc = (sig_ucontext_t *)ucontext;

 /* Get the address at the time the signal was raised from the EIP (x86) */
 caller_address = (void *) uc->uc_mcontext.eip;   

 fprintf(stderr, "signal %d (%s), address is %p from %p\n", 
  sig_num, strsignal(sig_num), info->si_addr, 
  (void *)caller_address);

 size = backtrace(array, 50);

 /* overwrite sigaction with caller's address */
 array[1] = caller_address;

 messages = backtrace_symbols(array, size);

 /* skip first stack frame (points here) */
 for (i = 1; i < size && messages != NULL; ++i)
 {
  fprintf(stderr, "[bt]: (%d) %s\n", i, messages[i]);
 }

 free(messages);

 exit(EXIT_FAILURE);
}

int crash()
{
 char * p = NULL;
 *p = 0;
 return 0;
}

int foo4()
{
 crash();
 return 0;
}

int foo3()
{
 foo4();
 return 0;
}

int foo2()
{
 foo3();
 return 0;
}

int foo1()
{
 foo2();
 return 0;
}

int main(int argc, char ** argv)
{
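 struct sigaction sigact;

 /* zero the struct and clear the mask so no field is left uninitialized */
 memset(&sigact, 0, sizeof(sigact));
 sigemptyset(&sigact.sa_mask);
 sigact.sa_sigaction = crit_err_hdlr;
 sigact.sa_flags = SA_RESTART | SA_SIGINFO;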

 if (sigaction(SIGSEGV, &sigact, (struct sigaction *)NULL) != 0)
 {
  fprintf(stderr, "error setting signal handler for %d (%s)\n",
    SIGSEGV, strsignal(SIGSEGV));

  exit(EXIT_FAILURE);
 }

 foo1();

 exit(EXIT_SUCCESS);
}

Output

signal 11 (Segmentation fault), address is (nil) from 0x8c50
[bt]: (1) ./test(crash+0x24) [0x8c50]
[bt]: (2) ./test(foo4+0x10) [0x8c70]
[bt]: (3) ./test(foo3+0x10) [0x8c8c]
[bt]: (4) ./test(foo2+0x10) [0x8ca8]
[bt]: (5) ./test(foo1+0x10) [0x8cc4]
[bt]: (6) ./test(main+0x74) [0x8d44]
[bt]: (7) /lib/libc.so.6(__libc_start_main+0xa8) [0x40032e44]

All the hazards of calling the backtrace() functions in a signal handler still exist and should not be overlooked, but I find the functionality I described here quite helpful in debugging crashes.
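
To reduce some of those hazards, a handler can avoid stdio and malloc altogether. The following is a sketch under a few assumptions: glibc's execinfo.h is available, the names crash_handler and install_crash_handlers are invented here, and the warm-up call follows the note in the backtrace(3) man page that libgcc is loaded lazily on first use (which may allocate).

```cpp
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

#include <cstring>

// Writes a C string to stderr using only async-signal-safe calls.
static void safe_write(const char* text) {
    ssize_t ignored = write(STDERR_FILENO, text, std::strlen(text));
    (void)ignored;
}

// Handler: print the trace with write()/backtrace_symbols_fd() only.
extern "C" void crash_handler(int sig) {
    void* frames[64];
    safe_write("fatal signal, backtrace follows:\n");
    int count = backtrace(frames, 64);
    backtrace_symbols_fd(frames, count, STDERR_FILENO);
    // SA_RESETHAND already restored the default action, so re-raising
    // lets the process die with the original signal (and dump core if
    // core dumps are enabled).
    raise(sig);
}

// Installs the handler for the common fatal signals; true on success.
bool install_crash_handlers() {
    // Warm-up call so any lazy loading happens now, not inside the handler.
    void* warmup[1];
    backtrace(warmup, 1);

    struct sigaction sa{};
    sa.sa_handler = crash_handler;
    sa.sa_flags = SA_RESETHAND;  // one-shot: default action is restored
    sigemptyset(&sa.sa_mask);

    const int fatal_signals[] = {SIGSEGV, SIGBUS, SIGILL, SIGFPE, SIGABRT};
    bool ok = true;
    for (int sig : fatal_signals) {
        ok = (sigaction(sig, &sa, nullptr) == 0) && ok;
    }
    return ok;
}
```

This still is not guaranteed safe (a corrupted stack can make backtrace() itself fault), but it keeps fprintf(), malloc() and exit() out of the handler.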

It is important to note that the example I provided is developed/tested on Linux for x86. I have also successfully implemented this on ARM using uc_mcontext.arm_pc instead of uc_mcontext.eip.
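
Since the register name differs per architecture, the lookup can be isolated in one small function. This is a sketch assuming Linux with glibc-style ucontext_t field names (gregs[REG_RIP], gregs[REG_EIP], pc, arm_pc); fault_pc, record_pc and g_last_fault_pc are names invented for the example, and other platforms simply get a null pointer.

```cpp
#include <signal.h>
#include <ucontext.h>

// Hypothetical helper: extract the program counter at the time the signal
// was raised from the third argument of an SA_SIGINFO handler.
void* fault_pc(void* context) {
    ucontext_t* uc = static_cast<ucontext_t*>(context);
#if defined(__linux__) && defined(__x86_64__)
    return reinterpret_cast<void*>(uc->uc_mcontext.gregs[REG_RIP]);
#elif defined(__linux__) && defined(__i386__)
    return reinterpret_cast<void*>(uc->uc_mcontext.gregs[REG_EIP]);
#elif defined(__linux__) && defined(__aarch64__)
    return reinterpret_cast<void*>(uc->uc_mcontext.pc);
#elif defined(__linux__) && defined(__arm__)
    return reinterpret_cast<void*>(uc->uc_mcontext.arm_pc);
#else
    (void)uc;
    return nullptr;  // unknown platform: caller should fall back to backtrace()
#endif
}

// Last program counter seen by the demo handler below.
static void* volatile g_last_fault_pc = nullptr;

// Demo SA_SIGINFO handler that only records the interrupted address.
extern "C" void record_pc(int, siginfo_t*, void* context) {
    g_last_fault_pc = fault_pc(context);
}
```

A real crash handler would store the result of fault_pc() into array[1] in place of the hard-coded eip access, exactly as the example above does.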

Here's a link to the article where I learned the details for this implementation: http://www.linuxjournal.com/article/6391

jschmier
On systems using GNU ld, remember to compile with `-rdynamic` to instruct the linker to add all symbols, not only used ones, to the dynamic symbol table. This allows `backtrace_symbols()` to convert addresses to function names
jschmier
The output in the example above was taken from a test program compiled using a gcc-3.4.5-glibc-2.3.6 cross-toolchain and executed on an ARMv6-based platform running Linux Kernel 2.6.22.
jschmier
+3  A: 

Even though a correct answer has been provided that describes how to use the GNU libc backtrace() function [1], and I provided my own answer that describes how to ensure a backtrace from a signal handler points to the actual location of the fault [2], I don't see any mention of demangling C++ symbols output from the backtrace.

When obtaining backtraces from a C++ program, the output can be run through c++filt [1] to demangle the symbols.

  • [1] Linux & OS X
  • [2] Linux

The following C++ Linux example uses the same signal handler as my other answer and demonstrates how c++filt can be used to demangle the symbols.

Code:

class foo
{
public:
    foo() { foo1(); }

private:
    void foo1() { foo2(); }
    void foo2() { foo3(); }
    void foo3() { foo4(); }
    void foo4() { crash(); }
    void crash() { char * p = NULL; *p = 0; }
};

int main(int argc, char ** argv)
{
    // Setup signal handler for SIGSEGV
    ...

    foo * f = new foo();
    return 0;
}

Output (./test):

signal 11 (Segmentation fault), address is (nil) from 0x8048e07
[bt]: (1) ./test(crash__3foo+0x13) [0x8048e07]
[bt]: (2) ./test(foo4__3foo+0x12) [0x8048dee]
[bt]: (3) ./test(foo3__3foo+0x12) [0x8048dd6]
[bt]: (4) ./test(foo2__3foo+0x12) [0x8048dbe]
[bt]: (5) ./test(foo1__3foo+0x12) [0x8048da6]
[bt]: (6) ./test(__3foo+0x12) [0x8048d8e]
[bt]: (7) ./test(main+0xe0) [0x8048d18]
[bt]: (8) ./test(__libc_start_main+0x95) [0x42017589]
[bt]: (9) ./test(__register_frame_info+0x3d) [0x8048981]

Demangled Output (./test 2>&1 | c++filt):

signal 11 (Segmentation fault), address is (nil) from 0x8048e07
[bt]: (1) ./test(foo::crash(void)+0x13) [0x8048e07]
[bt]: (2) ./test(foo::foo4(void)+0x12) [0x8048dee]
[bt]: (3) ./test(foo::foo3(void)+0x12) [0x8048dd6]
[bt]: (4) ./test(foo::foo2(void)+0x12) [0x8048dbe]
[bt]: (5) ./test(foo::foo1(void)+0x12) [0x8048da6]
[bt]: (6) ./test(foo::foo(void)+0x12) [0x8048d8e]
[bt]: (7) ./test(main+0xe0) [0x8048d18]
[bt]: (8) ./test(__libc_start_main+0x95) [0x42017589]
[bt]: (9) ./test(__register_frame_info+0x3d) [0x8048981]
jschmier
It seems that demangling capabilities can also be included in a C/C++ application via `abi::__cxa_demangle`. - http://gcc.gnu.org/onlinedocs/libstdc++/manual/ext_demangling.html
jschmier
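
A small sketch of the abi::__cxa_demangle approach mentioned in the comment above, assuming a compiler that uses the Itanium C++ ABI name mangling (current gcc and clang, where foo::crash() is mangled as _ZN3foo5crashEv rather than the older crash__3foo style shown in the output). The helper names demangle and demangle_frame are invented for this example.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <string>

// Returns the demangled form of `mangled`, or the input unchanged if it
// is not a valid mangled name.
std::string demangle(const char* mangled) {
    int status = 0;
    // With null buffer arguments, __cxa_demangle mallocs the result.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);
    return result;
}

// Demangles the symbol inside one backtrace_symbols() line of the form
// "module(symbol+0xoffset) [address]"; other lines are returned unchanged.
std::string demangle_frame(const std::string& line) {
    std::string::size_type open = line.find('(');
    std::string::size_type plus = line.find('+', open);
    if (open == std::string::npos || plus == std::string::npos ||
        plus == open + 1) {
        return line;  // no symbol name present in this frame
    }
    std::string name = line.substr(open + 1, plus - open - 1);
    return line.substr(0, open + 1) + demangle(name.c_str()) +
           line.substr(plus);
}
```

Running each line from backtrace_symbols() through demangle_frame() gives roughly what piping the output through c++filt does, without needing the external tool on the user's machine. Note that __cxa_demangle allocates, so this belongs in post-processing rather than in the signal handler itself.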