On Solaris 9 and 10, on both x86 and SPARC, we have a process that hangs during exit:

fe0b5994 lwp_park (0, 0, 0)
fe0b206c slow_lock (ff388908, fe080400, 0, 0, 98, fe0abe00) + 58
ff376aa8 __deregister_frame_info_bases (2a518, 1, 0, 2daf0, 0, ff376be4) + 4c
00014858 ???????? (0, ff000000, 0, 0, 0, 0)
00019920 _fini    (0, 0, 210fc, fe21cbf0, 5, fe25897c) + 4
fe21cbf0 _exithandle (fee66a4c, 0, 40, 0, 0, fe2bc000) + 70
fe2a0564 exit     (0, fdefb47c, 40, fdefb8ff, 2c, 0) + 24
fee66a4c (our code) (4e280, 5ab5c, 5aa60, 2ed0, 81010100, fdefb988) + 244

Our code is compiled on the Solaris 9 machine, using gcc 3.4.6.

The process in question is a single-threaded child from a multi-threaded parent, forked but not execed.

Has anyone seen anything similar?

Do you know if a more recent version of gcc would fix the problem?

A: 

This is exactly why you should always exec after fork in a multi-threaded (MT) process: you don't know which locks other threads held in the parent at the moment of the fork, or when the child will need one of them. Here the child needs one at exit, but it can never acquire it, because the thread that held it doesn't exist in the child.

A newer version of GCC is unlikely to help. Even if it does, it's only a matter of time before you hit another lock like this.

Either fork before creating the first thread, or exec immediately after fork. These are really the only sensible choices.
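As a rough illustration of the "exec immediately after fork" option, here is a minimal sketch in C. The helper name `spawn_and_wait` is hypothetical, and the use of exit status 127 for a failed exec is just a common convention, not something from the question. The child does nothing between fork() and execl(), and falls back to _exit() if the exec fails, so it never reaches the exit-time cleanup that hangs in the stack trace above.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, exec `path` in the child, and return the
 * child's exit status. Returns -1 if fork/waitpid fails or the child
 * was terminated by a signal. */
int spawn_and_wait(const char *path)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: go straight to exec, taking no library locks on the way. */
        execl(path, path, (char *)NULL);
        /* Only reached if exec failed. Use _exit() so that atexit
         * handlers and _fini are not run in the forked child. */
        _exit(127);
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `spawn_and_wait("/bin/true")` should return 0 on a system where that program exists, and a path that cannot be executed should come back as 127 from the `_exit()` fallback.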

Employed Russian
I think you're right. I've changed the common path to exec /bin/true instead of doing a normal exit. Of course, I still have a problem if execl() fails.
Douglas Leeder
A: 

You could try calling _exit() to exit the child process, rather than exit(). exit() is a library function that does various forms of library cleanup before exiting; for example, it runs atexit handlers and flushes stdio buffers to disk. _exit() is the actual system call that terminates the process. Even in single-threaded programs, you normally use _exit() inside forked children to prevent library cleanup from happening twice.
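The difference can be observed with a small experiment. The sketch below is only an illustration under assumptions: `cleanup_handler` and `child_ran_cleanup` are hypothetical names, and the atexit handler merely stands in for the library cleanup (such as `_fini`) seen in the question's stack trace. The child reports through a pipe whether its exit-time handler ran.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the pipe; set only in the child, -1 everywhere else. */
static int notify_fd = -1;

/* Hypothetical handler standing in for library cleanup at exit. */
static void cleanup_handler(void)
{
    if (notify_fd >= 0) {
        ssize_t rc = write(notify_fd, "x", 1);
        (void)rc;
    }
}

/* Fork a child that leaves via exit() (use_plain_exit != 0) or _exit().
 * Returns 1 if the child ran the atexit handler, 0 if not, -1 on error. */
int child_ran_cleanup(int use_plain_exit)
{
    static int registered = 0;
    int fds[2];
    pid_t pid;
    char byte;
    ssize_t n;

    if (!registered) {
        if (atexit(cleanup_handler) != 0)
            return -1;
        registered = 1;
    }
    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        notify_fd = fds[1];
        if (use_plain_exit)
            exit(0);   /* runs atexit handlers, flushes stdio */
        _exit(0);      /* terminates at once, no library cleanup */
    }

    close(fds[1]);
    n = read(fds[0], &byte, 1);   /* 0 bytes (EOF): handler never ran */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n < 0 ? -1 : (n == 1);
}
```

With this sketch, `child_ran_cleanup(1)` should report that the handler ran in the child, while `child_ran_cleanup(0)` should report that it did not, which is the behaviour that lets a forked child avoid cleanup code that needs a lock held by a thread that no longer exists.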

Kenster
Good idea - I've changed all the exit calls in that section of code to _exit.
Douglas Leeder
The official Single UNIX specification says to use _exit() rather than exit() after a fork() - http://www.opengroup.org/pubs/online/7908799/xsh/vfork.html
Douglas Leeder