I've got a client-server tiered architecture with the client making RPC-like requests to the server. I'm using Tomcat to host the servlets, and the Apache HttpClient to make requests to it.

My code goes something like this:

    private static final HttpConnectionManager CONN_MGR = new MultiThreadedHttpConnectionManager();
    final GetMethod get = new GetMethod();
    final HttpClient httpClient = new HttpClient(CONN_MGR);
    get.getParams().setCookiePolicy(CookiePolicy.IGNORE_COOKIES);
    get.getParams().setParameter(HttpMethodParams.USER_AGENT, USER_AGENT);

    get.setQueryString(encodedParams);
    int responseCode;
    try {
        responseCode = httpClient.executeMethod(get);
    } catch (final IOException e) {
        ...
    }
    if (responseCode != 200)
        throw new Exception(...);

    String responseHTML;
    try {
        responseHTML = get.getResponseBodyAsString(100*1024*1024);
    } catch (final IOException e) {
        ...
    }
    return responseHTML;

It works great in a lightly-loaded environment, but when I'm making hundreds of requests per second I start to see this -

Caused by: java.net.BindException: Address already in use
    at java.net.PlainSocketImpl.socketBind(Native Method)
    at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:336)
    at java.net.Socket.bind(Socket.java:588)
    at java.net.Socket.<init>(Socket.java:387)
    at java.net.Socket.<init>(Socket.java:263)
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
    at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
    at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)

Any thoughts on how to fix this? I'm guessing it's something to do with the client trying to reuse the ephemeral client ports, but why is this happening / how can I fix it? Thanks!

+1  A: 

With hundreds of connections a second, and without knowing how long your connections take to open, do their thing, close, and get recycled, I suspect that this is just a problem you're going to have. One thing you can do is catch the BindException in your try block, handle the bind-unsuccessful case there, and wrap the whole call in a while loop that depends on a flag indicating whether the bind succeeded. Off the top of my head:

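    boolean hasBound = false;
    while (!hasBound) {
        try {
            responseCode = httpClient.executeMethod(get);
            // Only reached if executeMethod did not throw.
            hasBound = true;
        } catch (BindException e) {
            // do anything you want in the bind-unsuccessful case; the loop then retries
        } catch (final IOException e) {
            // other I/O failures: handle and leave the loop (e.g. rethrow)
            ...
        }
    }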
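
If you do go the retry route, it is probably worth bounding it so that a persistent failure cannot spin forever. A minimal sketch, assuming a hypothetical helper called `BindRetry` (not part of HttpClient) and made-up attempt and backoff values:

```java
import java.net.BindException;
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of HttpClient: retries a call only when it fails with BindException. */
final class BindRetry {
    private BindRetry() {}

    /**
     * Runs the request up to maxAttempts times. Only BindException triggers a retry;
     * any other exception propagates immediately.
     */
    static <T> T call(Callable<T> request, int maxAttempts, long backoffMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        BindException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (BindException e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Linear backoff: wait a little longer after each failed attempt.
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        // Every attempt failed with BindException; surface the last one.
        throw last;
    }
}
```

With the objects from the question, the call site would look something like `responseCode = BindRetry.call(() -> httpClient.executeMethod(get), 3, 50L);`. This only papers over the symptom; it does not free up any ports.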

Update, with a question: what are the maximum total and per-host numbers of connections allowed by your MultiThreadedHttpConnectionManager? In your code, that'd be:

    CONN_MGR.getParams().getDefaultMaxConnectionsPerHost();
    CONN_MGR.getParams().getMaxTotalConnections();
delfuego
Per host: 2, max: 20. How in the world could I be exceeding any socket limits with such conservative defaults... :-p
Steven Schlansker
A: 

It sounds like you've opened more connections than there are TCP/IP ports available. I don't use HttpClient, so I can't go into detail about this, but in theory there are three solutions for this particular problem:

  1. Hardware based: add another NIC (network interface card).
  2. Software based: close connections directly after use and/or increase the connection timeout.
  3. Platform based: increase the number of TCP/IP ports that are allowed to be opened. This may be OS-specific and/or NIC driver-specific. The absolute maximum is 65535, of which several may already be reserved or in use (e.g. port 80).
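
To get a feel for option 3, here is a back-of-the-envelope calculation of how many new connections per second a client could sustain before running out of ephemeral ports. The numbers are assumptions, not measurements: a port range of 32768-60999 and a 60-second TIME_WAIT are common Linux defaults, and the model assumes every request opens a fresh connection whose port then sits in TIME_WAIT. `PortBudget` is a hypothetical name.

```java
/** Hypothetical back-of-the-envelope model; not an API of the JDK or HttpClient. */
final class PortBudget {
    private PortBudget() {}

    /**
     * Upper bound on new connections per second if each one ties up a
     * distinct ephemeral port for timeWaitSeconds after it closes.
     */
    static long maxSustainedConnectsPerSecond(int lowPort, int highPort, int timeWaitSeconds) {
        if (lowPort < 1 || highPort > 65535 || lowPort > highPort) {
            throw new IllegalArgumentException("invalid port range");
        }
        if (timeWaitSeconds < 1) {
            throw new IllegalArgumentException("timeWaitSeconds must be >= 1");
        }
        long ports = (long) highPort - lowPort + 1; // inclusive range
        return ports / timeWaitSeconds;             // integer division rounds down
    }
}
```

With those assumed defaults, `PortBudget.maxSustainedConnectsPerSecond(32768, 60999, 60)` comes out at 470, which is in the same range as the "hundreds of requests per second" in the question.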
BalusC
Wouldn't it fail with an exception more oriented towards being out of said resources? Most of the resource-exhaustion errors come from the socket() call - the manpage mentions at least three different ways it can fail due to lack of resources, while the bind manpage doesn't have any. I'd expect the Java exception to indicate this. Do you still think this could be a problem?
Steven Schlansker
It would only fail with such an exception if HttpClient were coded to do so. I haven't looked at the code for `HttpClient#executeMethod`, but from your experience, it appears they didn't. It'd be reasonable to think that they might have caught the BindException internally and retried the connection on a different client port, but that would open up a whole can of worms (how many times do they retry? how do they know that's what you want to do?), so it also makes sense that they just throw the BindException and let you decide.
delfuego
By looking at the trace you can see it's being thrown directly from the native Socket connection methods
Steven Schlansker
Sure, Steven, but my point is that `HttpClient` could catch that thrown exception and do something with it -- with the caveats I stated, that it'd still be tough to decide how many retries it should attempt, etc., and it's just as reasonable to expect `HttpClient` to just allow the exception to pass on so you can decide how to handle it yourself.
delfuego
Sure, having me deal with it is fine; I'm just indicating that the comment that "you've fired more requests than TCP/IP ports are allowed to be opened" doesn't match the error I'm getting.
Steven Schlansker
A: 

A very good discussion of the problem you are running into can be found here. On the Tomcat side, by default it will use the SO_REUSEADDR option, which will allow the server to reuse sockets which are in TIME_WAIT. Additionally, the Apache http client will by default use keep-alives, and attempt to reuse connections.

Your problem seems to be caused by not calling `releaseConnection()` on the method object (`get` in your code) once you have read the response. This is required in order for the connection to be reused. Otherwise, the connection will remain open until the garbage collector comes along and closes it, or the server disconnects the keep-alive. In both cases, it won't be returned to the pool.
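
The release-in-finally pattern can be sketched with a plain `Semaphore` standing in for the connection pool. This is a toy model, not HttpClient's API: `ReleaseDemo` and `withConnection` are hypothetical names, and the two permits in the usage note simply mirror the per-host limit of 2 reported in the comments. In the question's code the equivalent step would be calling `get.releaseConnection()` in a `finally` block after the body has been read.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

/** Toy model of a connection pool; a permit stands in for a pooled connection. */
final class ReleaseDemo {
    private ReleaseDemo() {}

    /**
     * Borrows one "connection" from the pool, runs the work, and always
     * hands the connection back, even when the work throws.
     */
    static <T> T withConnection(Semaphore pool, Callable<T> work) throws Exception {
        pool.acquire();
        try {
            return work.call();
        } finally {
            // Without this release, a failed request would leak its permit
            // and the pool would eventually have nothing left to hand out.
            pool.release();
        }
    }
}
```

For example, with `Semaphore pool = new Semaphore(2)`, `pool.availablePermits()` is back at 2 after any call to `withConnection`, whether the work returned normally or threw.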

brianegge
A: 

So it turns out the problem was that one of the other HttpClient instances accidentally wasn't using the MultiThreadedHttpConnectionManager I instantiated, so I effectively had no rate limiting at all. Fixing this problem fixed the exception being thrown.

Thanks for all the suggestions, though!

Steven Schlansker