I have a Java and Tomcat-based server application that makes many outbound HTTP requests to other web sites. We use the Jakarta HttpCore/HttpClient libraries, at their latest versions.

At some point the server locks up because all of its worker threads are stuck trying to close completed HTTP connections. Running `lsof` shows many sockets stuck in the TCP `CLOSE_WAIT` state.

This doesn't happen for all, or even most, connections. In fact, I saw it before and resolved it by making sure to set the `Connection: close` request header. That makes me think it may be caused by bad behavior on the part of the remote servers.

It may have come up again because I moved the app to a totally new service provider, with a different OS and a different network setup.

Still, I am at a loss as to what, if anything, I can do to work around this. Some poking around on the internet didn't turn up anything I'm not already doing. Has anyone seen and solved this?

A: 

I'm not sure how much you know about TCP. A TCP endpoint ends up in the `CLOSE_WAIT` state when it is in the `ESTABLISHED` state and receives a FIN packet from its peer. `CLOSE_WAIT` means the TCP stack is waiting for a close command from the application layer; in this case, it is waiting for close() to be called on the socket.

So my guess is that calling close() in the worker threads will fix the problem. Or are you already doing this?
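
To make that state transition concrete, here is a minimal sketch using plain `java.net` sockets rather than the Jakarta libraries. The class and method names are made up for illustration. A throwaway local server sends a few bytes and closes its side; the client sees end-of-stream when `read()` returns -1, and from that point its socket stays in `CLOSE_WAIT` until `close()` is called, which the try-with-resources block takes care of.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class CloseWaitDemo {

    /** Connects to a throwaway local server that sends "hello" and closes; returns what was read. */
    static String readUntilPeerCloses() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // The "remote" side: accept one connection, write, then close (which sends a FIN).
            Thread peer = new Thread(() -> {
                try (Socket s = server.accept()) {
                    s.getOutputStream().write("hello".getBytes(StandardCharsets.US_ASCII));
                } catch (IOException ignored) {
                    // Nothing useful to do in this demo.
                }
            });
            peer.start();

            StringBuilder received = new StringBuilder();
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                InputStream in = client.getInputStream();
                int b;
                while ((b = in.read()) != -1) {
                    received.append((char) b);
                }
                // read() returned -1, so the peer's FIN has arrived. Until close() runs
                // (at the end of this try block) the client socket is in CLOSE_WAIT.
            }
            peer.join();
            return received.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readUntilPeerCloses());
    }
}
```

If the client socket were never closed, `lsof` would keep showing it in `CLOSE_WAIT` for as long as the process holds the descriptor.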

Matt Ball
Yeah I get that much, though I think we're using the Jakarta libraries correctly and it works properly on almost all connections. But I agree, this seems like the part to dig into. I'll look at whether these connections are really being closed by the library.
Sean Owen
What differences do you see between connections that hang in `CLOSE_WAIT` and those that don't? Is it consistently the same ones?
Matt Ball
Yes, it's always to the same couple of hosts. I believe it must have something to do with how the remote host treats the connection, but I don't know what it is or how to work around it. I do see that the connection is always closed on the Java side at the application layer.
Sean Owen
A: 

I believe I might have found a solution -- at least, these changes, together, seem to have made the problem go away.

  • Call `HttpEntity.consumeContent()` after you're done reading from its InputStream, to make sure the content is fully consumed and the framework releases the connection.
  • For good measure, I call `ClientConnectionManager.closeExpiredConnections()` and `ClientConnectionManager.closeIdleConnections(0L, TimeUnit.MILLISECONDS)` just after this, to push the framework to release anything it's done with right away. This might be overkill.
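
The idea behind the first bullet, that a connection is only handed back once its response body has been read to the end and the stream closed, can be sketched with nothing but the JDK. This is a rough analog and not the HttpClient implementation; `StreamDrain` and `drainAndClose` are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, shown only to illustrate "consume the rest, then close".
class StreamDrain {

    /** Reads and discards whatever is left in the stream, always closes it, and returns the byte count discarded. */
    static long drainAndClose(InputStream in) {
        byte[] buf = new byte[4096];
        long discarded = 0;
        // try-with-resources closes the stream even if reading fails part-way.
        try (InputStream stream = in) {
            int n;
            while ((n = stream.read(buf)) != -1) {
                discarded += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return discarded;
    }
}
```

A worker thread would call something like this in a `finally` block after it has taken what it needs from the response, so that unread bytes never keep the underlying connection checked out.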
Sean Owen
A: 

OK, the above didn't quite do the trick. In the end, I had to use a `SingleClientConnManager`, creating it and shutting it down for every request. That finally fixed it. I assume this is what's needed to make sure the connection really is closed.
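
For comparison, the same "one connection per request, torn down afterwards" pattern can be sketched with the JDK's own `HttpURLConnection`. This is not the `SingleClientConnManager` code I ended up with, only a stdlib analog under stated assumptions: the class and method names are hypothetical, the timeouts are arbitrary, and `demo()` spins up a throwaway local server purely so the sketch can run on its own.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; a stdlib analog, not the Jakarta/HttpClient API.
class OneShotFetch {

    /** Fetches a URL over a connection that is used once and then discarded. */
    static String fetchOnce(String url) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestProperty("Connection", "close"); // ask the server not to keep the connection alive
            conn.setConnectTimeout(5000); // arbitrary values for the sketch
            conn.setReadTimeout(5000);
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            try (InputStream in = conn.getInputStream()) {
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) { // read to end-of-stream before closing
                    body.write(buf, 0, n);
                }
            }
            return new String(body.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (conn != null) {
                conn.disconnect(); // signal that this connection should not be held for reuse
            }
        }
    }

    /** Runs fetchOnce against a throwaway local server that answers "ok". */
    static String demo() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                byte[] body = "ok".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(body);
                }
            });
            server.start();
            try {
                return fetchOnce("http://127.0.0.1:" + server.getAddress().getPort() + "/");
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The trade-off is the same as with a per-request connection manager: no connection reuse, so every request pays for a fresh TCP handshake, in exchange for never leaving a half-closed socket behind.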

Sean Owen