I'm trying to debug a file descriptor leak in a Java webapp running in Jetty 7.0.1 on Linux.

The app had been happily running for a month or so when requests started to fail due to too many open files, and Jetty had to be restarted.

java.io.IOException: Cannot run program [external program]: java.io.IOException: error=24, Too many open files
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
    at java.lang.Runtime.exec(Runtime.java:593)
    at org.apache.commons.exec.launcher.Java13CommandLauncher.exec(Java13CommandLauncher.java:58)
    at org.apache.commons.exec.DefaultExecutor.launch(DefaultExecutor.java:246)

At first I thought the issue was with the code that launches the external program, but it's using commons-exec and I don't see anything wrong with it:

CommandLine command = new CommandLine("/path/to/command")
    .addArgument("...");
ByteArrayOutputStream errorBuffer = new ByteArrayOutputStream();
Executor executor = new DefaultExecutor();
executor.setWatchdog(new ExecuteWatchdog(PROCESS_TIMEOUT));
executor.setStreamHandler(new PumpStreamHandler(null, errorBuffer));
try {
    executor.execute(command);
} catch (ExecuteException executeException) {
    if (executeException.getExitValue() == EXIT_CODE_TIMEOUT) {
        throw new MyCommandException("timeout");
    } else {
        throw new MyCommandException(errorBuffer.toString("UTF-8"));
    }
}

Listing open files on the server, I can see a high number of FIFOs:

# lsof -u jetty
...
java    524 jetty  218w  FIFO        0,6      0t0 19404236 pipe
java    524 jetty  219r  FIFO        0,6      0t0 19404008 pipe
java    524 jetty  220r  FIFO        0,6      0t0 19404237 pipe
java    524 jetty  222r  FIFO        0,6      0t0 19404238 pipe

When Jetty starts there are just 10 FIFOs; after a few days there are hundreds of them.

I know it's a bit vague at this stage, but do you have any suggestions on where to look next, or how to get more detailed info about those file descriptors?

+3  A: 

Your external program is not terminating properly. Have a look at why it fails to exit cleanly.

Thorbjørn Ravn Andersen
So you're saying it's the external program that leaks file descriptors? But why does stopping Jetty clear up all the FIFOs then?
Mirko Nasato
No, the problem is that it doesn't finish properly, so the descriptors assigned to it (stdin, stdout, stderr) cannot be released. This is a known problem with Hudson too.
Thorbjørn Ravn Andersen
It can indeed terminate abnormally in some cases, and I can't avoid that: it depends on user input. But that may well be the issue then.
Mirko Nasato
Here's the Hudson issue: https://hudson.dev.java.net/issues/show_bug.cgi?id=715
Mirko Nasato
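
To illustrate the point about stdin, stdout and stderr: each child process started from Java gives the JVM three pipe ends, and they stay open in the parent until they are closed (or garbage collected). The following is only a minimal sketch using the plain JDK `ProcessBuilder`, not what commons-exec does internally. The class name `RunAndRelease` is made up for the example, and it assumes Java 7 or later.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class RunAndRelease {

    /**
     * Runs a command, discards its output, and returns its exit code.
     * The three pipe ends held by the JVM are closed in all cases,
     * including when the child terminates abnormally.
     */
    public static int run(String... command) {
        Process process = null;
        try {
            // Merge stderr into stdout so a single loop can drain both.
            process = new ProcessBuilder(command).redirectErrorStream(true).start();
            process.getOutputStream().close(); // child's stdin: nothing to send
            InputStream out = process.getInputStream();
            byte[] buffer = new byte[4096];
            while (out.read(buffer) != -1) {
                // Drain so the child never blocks on a full pipe.
            }
            return process.waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for child", e);
        } finally {
            if (process != null) {
                closeQuietly(process.getInputStream());
                closeQuietly(process.getErrorStream());
                closeQuietly(process.getOutputStream());
                process.destroy(); // harmless if the child has already exited
            }
        }
    }

    private static void closeQuietly(Closeable stream) {
        try {
            stream.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if close itself fails.
        }
    }

    public static void main(String[] args) {
        System.out.println("exit code: " + run("sh", "-c", "exit 3"));
    }
}
```

If the FIFO count in `lsof` stays flat while a loop calls something like `run(...)` repeatedly, but grows with the real launcher code, that would point at streams not being released on some code path (for example the timeout or error path).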
A: 

As you are running on Linux, I suspect you are running out of file descriptors. Check the per-process limit with `ulimit -n`. Here is an article that describes the problem: http://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files/

Greg Smith
This might get you up to 2 months of run-time! (This is a band-aid at best.)
Greg D
You've restated the symptom but not offered a real cure.
Carl Smotricz
In fact I did raise the limit in `/etc/security/limits.conf` already for other reasons.
Mirko Nasato
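
Since raising the limit only postpones the failure, it may be more useful to watch how close the JVM is to it over time. A minimal sketch, assuming a HotSpot/OpenJDK JVM on a Unix-like system where the platform bean implements `com.sun.management.UnixOperatingSystemMXBean`; the class name `FdUsage` is made up for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

public class FdUsage {

    /**
     * Returns {open, max} file descriptor counts for this JVM,
     * or null if the platform bean does not expose them.
     */
    public static long[] snapshot() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = snapshot();
        if (counts == null) {
            System.out.println("descriptor counts not available on this JVM");
        } else {
            System.out.println("open=" + counts[0] + " max=" + counts[1]);
        }
    }
}
```

Logging the open count periodically (or exposing it over JMX) would show whether the number climbs steadily or jumps after particular requests.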
+2  A: 

I don't know the nature of your app, but I have seen this error several times as the result of a connection pool leak, so that would be worth checking. On Linux, socket connections consume file descriptors just as files on the file system do. Just a thought.

alasdairg
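
To tell a socket leak apart from a pipe leak, one option is to classify the JVM's own descriptors by what they point to. A rough sketch, assuming Linux with `/proc` mounted and Java 8 or later; the class name `FdKinds` and the category labels are made up for the example. Each entry under `/proc/self/fd` is a symbolic link whose target looks like `pipe:[19404236]`, `socket:[...]`, or a regular path.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

public class FdKinds {

    /** Maps a /proc/self/fd link target to a coarse category. */
    public static String kindOf(String target) {
        if (target.startsWith("pipe:")) return "pipe";
        if (target.startsWith("socket:")) return "socket";
        if (target.startsWith("anon_inode:")) return "anon_inode";
        return "file";
    }

    /** Counts this process's open descriptors per category. */
    public static Map<String, Integer> count() throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(Paths.get("/proc/self/fd"))) {
            for (Path fd : fds) {
                String target;
                try {
                    target = Files.readSymbolicLink(fd).toString();
                } catch (IOException e) {
                    continue; // descriptor was closed while we were listing
                }
                counts.merge(kindOf(target), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(count());
    }
}
```

In the `lsof` output from the question the leaked entries are all of type FIFO, which this sketch would report under `pipe`; a connection pool leak would instead show up as a growing `socket` count.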