I have a Java program which is being started via ProcessBuilder from another Java program. System.exit(0) is called from the child program, but for some of our users (on Windows) the java.exe process associated with the child doesn't terminate. The child program has no shutdown hooks, nor does it have a SecurityManager which might stop System.exit() from terminating the VM. I can't reproduce the problem myself on Linux or Windows Vista. So far, the only reports of the problem come from two Windows XP users and one Vista user, using two different JREs (1.6.0_15 and 1.6.0_18), but they're able to reproduce the problem every time.

Can anyone suggest reasons why the JVM would fail to terminate after System.exit(), and then only on some machines?

Edit 1: I got the user to install the JDK so we could get a thread dump from the offending VM. What the user told me is that the VM process disappears from VisualVM as soon as he clicks on the 'Quit' item in my menu---but, according to Windows Task Manager, the process hasn't terminated, and no matter how long the user waits (minutes, hours), it never terminates.

Edit 2: I have confirmed now that Process.waitFor() in the parent program never returns for at least one of the users having the problem. So, to summarize: The child VM seems to be dead (VisualVM doesn't even see it) but the parent still sees the process as live and so does Windows.

+1  A: 

Maybe a badly written finalizer? A shutdown hook was my first thought when I read the subject line. Speculation: would a thread that catches InterruptedException and keeps on running anyway hold up the exit process?

It seems to me that if the problem is reproducible, you should be able to attach to the JVM and get a thread list/stack trace that shows what is hung up.
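If installing the JDK on the user's machine is a hurdle, one alternative is to have the child print its own thread list just before it exits. The following is only a sketch: the class name `ThreadDumper` and the idea of calling it from the 'Quit' handler are assumptions, not something taken from the original program. It relies only on `Thread.getAllStackTraces()`, which has been in the standard library since Java 5.

```java
import java.util.Map;

public class ThreadDumper {

    /** Returns a jstack-like listing of every live thread in this VM. */
    public static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append('"')
              .append(t.isDaemon() ? " daemon" : "")
              .append(" state=").append(t.getState())
              .append(System.lineSeparator());
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append(System.lineSeparator());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical usage: write the dump to stderr right before exiting,
        // so that it ends up in whatever log the parent keeps.
        System.err.print(dump());
        System.err.flush();
        System.exit(0);
    }
}
```

If the dump shows up in the log and the process still lingers, that would suggest the hang happens after the Java-level threads are done.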

Are you sure that the child is still really running and that it's not just an unreaped zombie process?

Chris Dolan
If it's a finalizer, then it's not one in our code---we don't have any. I'm sure the child is still running, since we see two Java processes and the user doesn't have any other programs on his system which use Java. I was kind of hoping to avoid asking him to install the JDK so we could use jstack, but I think now that might be the best way to proceed.
uckelman
What I meant about the child process is, does it have any threads left? Is it still responding to input? Does it still have open handles? I'm more familiar with Unix than Windows, but I know that processes aren't cleared from the Unix OS until the parent process reaps them and I think it's the same on Windows. Just because it shows up in top or Task Manager doesn't mean it's still active. Of course, that doesn't explain why it only happens to some users...
Chris Dolan
By default finalizers are not run on exit; see http://java.sun.com/javase/6/docs/api/java/lang/System.html#runFinalizersOnExit(boolean)
Stephen C
@Stephen C: ah, of course you're right. My bad.
Chris Dolan
+1  A: 

Does the parent process consume the error and output streams of the child process? If, on some OS, the child process prints errors or warnings to stdout/stderr and the parent process is not consuming those streams, the child process will block once the pipe buffer fills up and will never reach System.exit().
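For reference, a minimal sketch of what "consuming the streams" could look like on the parent side. The class and method names (`ChildStreamDrainer`, `drain`, `runAndDrain`) are made up for illustration, and the child command used in `main` (`java -version`) is just a stand-in for the real child program.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ChildStreamDrainer {

    /** Starts a daemon thread that reads the stream line by line into sink until EOF. */
    public static Thread drain(InputStream in, List<String> sink) {
        Thread t = new Thread(() -> {
            try (BufferedReader r = new BufferedReader(new InputStreamReader(in))) {
                String line;
                while ((line = r.readLine()) != null) {
                    sink.add(line);
                }
            } catch (IOException e) {
                sink.add("read failed: " + e);
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Runs the command, drains stdout and stderr concurrently, and returns the exit code. */
    public static int runAndDrain(List<String> command, List<String> out, List<String> err)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).start();
        Thread outThread = drain(p.getInputStream(), out);
        Thread errThread = drain(p.getErrorStream(), err);
        int exit = p.waitFor();
        // Wait for both readers to hit EOF so that no output is lost.
        outThread.join();
        errThread.join();
        return exit;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in child: the java launcher of the running JRE, asked for its version.
        String java = System.getProperty("java.home") + File.separator + "bin" + File.separator + "java";
        List<String> out = Collections.synchronizedList(new ArrayList<>());
        List<String> err = Collections.synchronizedList(new ArrayList<>());
        int exit = runAndDrain(Arrays.asList(java, "-version"), out, err);
        System.out.println("exit=" + exit + ", stdout lines=" + out.size() + ", stderr lines=" + err.size());
    }
}
```

Each stream gets its own reader thread so that a full stderr pipe cannot stall the child while the parent is busy reading stdout, or the other way round.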

Michael Konietzka
The parent process has one thread dedicated to consuming each of the child's STDOUT and STDERR (which passes that output through to a log file). So far as I can see, those are working properly, since we're seeing all the output we expect to see in the log.
uckelman
+2  A: 

Here are a couple of scenarios...

Per the definition of a Thread in http://java.sun.com/j2se/1.4.2/docs/api/java/lang/Thread.html

...

When a Java Virtual Machine starts up, there is usually a single non-daemon thread (which typically calls the method named main of some designated class). The Java Virtual Machine continues to execute threads until either of the following occurs:

1) The exit method of class Runtime has been called and the security manager has permitted the exit operation to take place.
2) All threads that are not daemon threads have died, either by returning from the call to the run method or by throwing an exception that propagates beyond the run method.
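To illustrate the two conditions quoted above, here is a small sketch with a non-daemon thread that swallows InterruptedException. All names in it (`ExitVersusThreads`, `StubbornWorker`, `requestStop`) are hypothetical. If `main` simply returned, condition 2 would never be met and the VM would stay up. Calling System.exit(0) takes the first path instead: it starts the shutdown sequence and does not wait for non-daemon threads to finish.

```java
public class ExitVersusThreads {

    /** Non-daemon worker that ignores interrupts and only stops when its flag is set. */
    static final class StubbornWorker extends Thread {
        private volatile boolean stop = false;

        StubbornWorker() {
            super("stubborn-worker");
            setDaemon(false);
        }

        @Override
        public void run() {
            while (!stop) {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException ignored) {
                    // Deliberately swallowed: the loop keeps running.
                }
            }
        }

        void requestStop() {
            stop = true;
        }
    }

    public static void main(String[] args) {
        StubbornWorker worker = new StubbornWorker();
        worker.start();
        worker.interrupt(); // has no lasting effect on this worker

        // Without this call the VM would keep running because of the worker.
        // With it, the VM shuts down even though the worker is still alive.
        System.exit(0);
    }
}
```

So a stubborn thread on its own should not be enough to keep the process around once System.exit() has actually been reached.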

Another possibility is that the method runFinalizersOnExit has been called. As per the documentation in http://java.sun.com/j2se/1.4.2/docs/api/java/lang/System.html:

Deprecated. This method is inherently unsafe. It may result in finalizers being called on live objects while other threads are concurrently manipulating those objects, resulting in erratic behavior or deadlock.

Enable or disable finalization on exit; doing so specifies that the finalizers of all objects that have finalizers that have not yet been automatically invoked are to be run before the Java runtime exits. By default, finalization on exit is disabled.

If there is a security manager, its checkExit method is first called with 0 as its argument to ensure the exit is allowed. This could result in a SecurityException.

Romain Hippeau
We don't have any calls to `runFinalizersOnExit`, nor do we have a `SecurityManager`, so I think these are not the cause.
uckelman
A: 

I think all of the obvious causes have been provisionally covered; e.g. finalizers, shutdown hooks, not correctly draining standard output / standard error in the parent process. You now need more evidence to figure out what is going on.

Suggestions:

  1. Set up a Windows XP or Vista machine (physical or virtual), install the relevant JRE and your app, and try to reproduce the problem. Once you can reproduce the problem, either attach a debugger or send the relevant signal to get a thread dump to standard error.

  2. If you cannot reproduce the problem as above, get one of your users to take a thread dump, and forward you the log file.

Stephen C
I can't reproduce it myself on Vista, and the user can't get a thread dump from VisualVM, as the VM seems to be no longer alive.
uckelman
+1  A: 

The parent process has one thread dedicated to consuming each of the child's STDOUT and STDERR (which passes that output through to a log file). So far as I can see, those are working properly, since we're seeing all the output we expect to see in the log

I had a similar problem with my program not disappearing from Task Manager when I was consuming the stdout/stderr. In my case, if I closed the stream that was listening before calling System.exit(), then the javaw.exe hung around. Strange, since it wasn't writing to the stream...

The solution in my case was to simply flush the stream rather than close it before exiting. Of course, you could always flush and then redirect back to stdout and stderr before exit.
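A rough sketch of the flush-instead-of-close idea, assuming the streams in question are the child's own System.out and System.err (that is my reading of the above; the class and method names are invented for the example):

```java
public class FlushBeforeExit {

    /** Flushes the standard streams but leaves them open. */
    public static void flushStandardStreams() {
        System.out.flush();
        System.err.flush();
    }

    public static void main(String[] args) {
        System.out.println("shutting down");
        // Flush rather than close, then exit.
        flushStandardStreams();
        System.exit(0);
    }
}
```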

Andrew McVeigh