I have a program that creates and tears down multiple threads throughout its life. Everything works fine for a while, but eventually I get the following core dump stack trace.

#0  0x009887a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x007617a5 in raise () from /lib/tls/libc.so.6
#2  0x00763209 in abort () from /lib/tls/libc.so.6
#3  0x003ec1bb in __gnu_cxx::__verbose_terminate_handler () from /usr/lib/libstdc++.so.6
#4  0x003e9ed1 in __cxa_call_unexpected () from /usr/lib/libstdc++.so.6
#5  0x003e9f06 in std::terminate () from /usr/lib/libstdc++.so.6
#6  0x003ea04f in __cxa_throw () from /usr/lib/libstdc++.so.6
#7  0x00d5562b in boost::thread::start_thread () from /h/Program/bin/../lib/libboost_thread-gcc34-mt-1_39.so.1.39.0

At first I was leaking threads, and figured the core dump was caused by hitting some limit on the number of concurrent threads, but now it seems that this problem occurs even when I don't leak any. For reference, in the core above there were 13 active threads executing.

I did some searching to try to figure out why start_thread would dump core, but I didn't come across anything. Does anyone have any ideas?

+2  A: 

start_thread is throwing an uncaught exception. Check which exceptions start_thread can throw and wrap the call in a try/catch to see what the problem is.
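
A minimal sketch of that idea follows. It uses C++11 `std::thread` as a stand-in so that it builds without Boost (an assumption; the question uses Boost 1.39, where the exception to catch would be `boost::thread_resource_error` rather than `std::system_error`). The helper name `try_start_worker` is hypothetical.

```cpp
#include <iostream>
#include <system_error>
#include <thread>

// Hypothetical helper: start a worker thread and report a failed start
// instead of letting the exception propagate up to std::terminate().
// Returns true if the thread was started and joined.
template <typename Fn>
bool try_start_worker(Fn fn) {
    try {
        std::thread t(fn);  // may throw std::system_error if no thread can be created
        t.join();           // joining releases the thread's resources
        return true;
    } catch (const std::system_error& e) {
        // Log the reason rather than aborting the whole process.
        std::cerr << "thread start failed: " << e.what()
                  << " (code " << e.code().value() << ")\n";
        return false;
    }
}
```

Calling `try_start_worker([] { /* work */ })` returns false and logs the error code when the thread cannot be started, which should at least show why the start failed instead of producing a core dump.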

Arkaitz Jimenez
After doing some digging, it seems that thread_resource_error is an exception that can be thrown. If I am not leaking threads, why would it be thrown?
Craig H
Are you catching that exception now? You may be running out of memory for thread stacks, or running out of file descriptors. Or perhaps threads that have already finished don't free their resources until you `join` them or the program exits, and you aren't joining them.
Arkaitz Jimenez
+2  A: 

What are the values carried by thread_resource_error? It looks like you can call native_error() to find out.

Since this is a wrapper around pthreads, there are only a few possibilities: EAGAIN, EINVAL and EPERM. Boost appears to have separate exceptions that it would likely throw for EINVAL and EPERM, i.e. unsupported_thread_option() and thread_permission_error().

That pretty much leaves EAGAIN, so I would double-check that you really aren't exceeding the system limits on the number of threads. Are you sure you are joining them, or, if they are detached, that they are really gone?
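
To make the error-code reasoning concrete, here is a rough sketch that calls `pthread_create` directly (the POSIX layer underneath) and translates its return value. The names `describe_create_error`, `noop_worker` and `spawn_and_join` are hypothetical, and the messages are paraphrases, not Boost's own text.

```cpp
#include <cerrno>
#include <pthread.h>
#include <string>

// Hypothetical helper: map a pthread_create() return code to a short hint.
std::string describe_create_error(int err) {
    switch (err) {
        case 0:      return "ok";
        case EAGAIN: return "EAGAIN: out of resources or thread limit reached";
        case EINVAL: return "EINVAL: invalid thread attributes";
        case EPERM:  return "EPERM: no permission for requested scheduling";
        default:     return "unexpected error code";
    }
}

// Trivial thread body used only for the demonstration.
static void* noop_worker(void*) { return nullptr; }

// Create and join `count` threads one at a time. Each thread is joined
// before the next one is created, so its resources are released and
// threads do not accumulate. Returns the first non-zero error code, or 0.
int spawn_and_join(int count) {
    for (int i = 0; i < count; ++i) {
        pthread_t tid;
        int err = pthread_create(&tid, nullptr, noop_worker, nullptr);
        if (err != 0) return err;
        err = pthread_join(tid, nullptr);
        if (err != 0) return err;
    }
    return 0;
}
```

With Boost, the value to pass to `describe_create_error` would presumably be the one returned by `native_error()` on the caught exception. If a loop like `spawn_and_join` succeeds but the real program still fails with EAGAIN, that would point toward threads that are never joined or detached.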

Duck