I have a defunct process on my system:

abc      22093 19508  0 23:29 pts/4    00:00:00 grep ProcA
abc      31756     1  0 Dec08 ?        00:00:00 [ProcA_my_collect] <defunct>

How can I kill the above process, without a reboot of the machine? I have tried with

kill -9 31756
sudo kill -9 31756
A: 

You're probably not going to be able to if killing the parent doesn't resolve it. For whatever reason, the system isn't collecting that zombie process.

FWIW, I've seen it quite a bit on the SCO Openserver boxen that I used to administer. Heavy multi-user usage and low system resources, but it didn't seem to hurt anything. Just annoyed me. :)

Brian Knoblauch
A: 

Did you check for a child process that may need to be killed first? Sometimes the jam-up is further down the line. Try ps -ef --forest to see what may be below it (if anything), then kill that first, then the one you already know about.

curtisk
+6  A: 

You have killed the process, but a dead process doesn't disappear from the process table until its parent process performs a task called "reaping" (essentially calling wait(2) for that process to read its exit status). Dead processes that haven't been reaped are called "zombie processes."
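To make the reaping step concrete, here is a minimal Python sketch, assuming a Linux system where os.waitid is available. The helper names pid_exists and zombie_demo are made up for illustration. It forks a child that exits at once, confirms the zombie still occupies a process-table slot, shows that SIGKILL is accepted but changes nothing, and then reaps the child.

```python
import os
import signal


def pid_exists(pid):
    """Return True if the process table still has an entry for pid."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except ProcessLookupError:
        return False
    return True


def zombie_demo():
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child: exit immediately with status 7

    # Block until the child has exited, but leave it unreaped (WNOWAIT),
    # so from here on it is a zombie.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before_kill = pid_exists(pid)

    # Sending SIGKILL to a zombie succeeds, yet there is nothing left to kill.
    os.kill(pid, signal.SIGKILL)
    after_kill = pid_exists(pid)

    # Reaping: the parent collects the exit status and the entry goes away.
    _, status = os.waitpid(pid, 0)
    after_wait = pid_exists(pid)
    return before_kill, after_kill, os.WEXITSTATUS(status), after_wait
```

On a Linux box, zombie_demo() should return (True, True, 7, False): the entry survives kill -9 and only disappears once the parent waits for it.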

The parent process id you see for 31756 is process id 1, which always belongs to init. That process does not reap its dead child processes. So they will remain zombies in the process table until you reboot.

Bill Karwin
Init certainly reaps dead children. Think about the implications of that not being true: Every child process would have to terminate before its parent.
janm
Restarting init may help - on Linux, telinit u should be able to do that.
bdonlan
+1  A: 

If kill -9 fails to kill a process, the cause is almost always a driver or operating system bug.

The init process has adopted the process, but it cannot reap it. That is to say: when init calls wait(2) that process is not returned. One of the primary purposes of init is to reap dead orphaned children, so the problem is not that its parent died before it was reaped. Think: Otherwise, who reaps the results of a nohup'd process after logout?
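The adoption described above can be observed with a small Python sketch. It assumes a POSIX system; orphan_demo is a hypothetical name. A middle process forks a grandchild and exits right away, and the grandchild reports who its parent is once the middle process is gone. On many systems the adopter is PID 1, but on modern Linux it may be a "subreaper" process instead, so the sketch only reports the adopter rather than assuming it is init.

```python
import os
import time


def orphan_demo():
    """Return (middle_pid, adopter_pid) for an orphaned grandchild."""
    r, w = os.pipe()
    child = os.fork()
    if child == 0:
        os.close(r)
        middle_pid = os.getpid()
        if os.fork() == 0:
            # Grandchild: wait (up to 5 s) until the parent PID changes,
            # which happens once the middle process has exited.
            deadline = time.monotonic() + 5
            while os.getppid() == middle_pid and time.monotonic() < deadline:
                time.sleep(0.01)
            os.write(w, f"{middle_pid} {os.getppid()}".encode())
            os._exit(0)
        os._exit(0)  # middle process exits, orphaning the grandchild

    os.close(w)
    os.waitpid(child, 0)  # reap the middle process ourselves
    with os.fdopen(r) as f:
        middle, adopter = map(int, f.read().split())
    return middle, adopter
```

The grandchild is never waited for by the code above; whichever process adopted it is responsible for reaping it when it exits.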

Killing children of the defunct process is unlikely to help unless they are somehow related to the particular bug you are seeing.

janm
A: 

UID PID PPID C STIME TTY TIME CMD
"username" 70986 1 23 14:41:59 - 250:37 [ksh]

I have this child process running on my UNIX server. The PPID for this process is 1 and there is no TTY attached to it, so I am not able to kill it.

kill -9 70986 completes with return code zero but does not do anything.

The worst thing is that it is clocking up CPU time (259:37) while doing nothing. Is there no alternative to restarting the server to get rid of this process?

For doing a restart, it needs a lot of approvals and complex procedures.

Honestly speaking, I was disappointed to see this shortcoming of a powerful OS like UNIX.

Idris.

Idris
A: 

The process is probably hanging somewhere, e.g. while ignoring signals such as SIGPIPE. Run "strace -p" with the process ID to see what is happening.
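If strace is not at hand, one way to see which signals a process ignores on Linux is the SigIgn bitmask in /proc/PID/status. The following is a rough Python sketch, not a replacement for strace; decode_sigmask and ignored_signals are hypothetical helper names, and the /proc layout is a Linux-specific assumption. Note that SIGKILL cannot be ignored, so this only helps with other signals.

```python
import signal


def decode_sigmask(hex_mask):
    """Turn a hex bitmask such as the SigIgn field into signal names.

    Bit n-1 of the mask corresponds to signal number n.
    """
    mask = int(hex_mask, 16)
    names = []
    for signum in range(1, 65):
        if mask >> (signum - 1) & 1:
            try:
                names.append(signal.Signals(signum).name)
            except ValueError:
                # Numbers without a symbolic name (e.g. most real-time signals)
                names.append(f"SIG{signum}")
    return names


def ignored_signals(pid):
    """Read the SigIgn line from /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("SigIgn:"):
                return decode_sigmask(line.split()[1])
    return []
```

For example, decode_sigmask("0000000000001000") yields ["SIGPIPE"], since bit 12 corresponds to signal 13.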

tolazy