I have read that HttpURLConnection supports persistent connections, so that a connection can be reused for multiple requests. I tried it, and the only way to send a second POST was to call openConnection() a second time. Otherwise I got an IllegalStateException("Already connected"). I used the following:

URL url;
try {
    url = new URL("http://someconection.com");
} catch (MalformedURLException e) {
    // url must be declared outside the try block to be visible below
    throw new IllegalArgumentException(e);
}
HttpURLConnection con = (HttpURLConnection) url.openConnection();
// set output, input etc.
// send POST
// receive response
// read whole response
// close input stream
con.disconnect(); // have also tested commenting this out
con = (HttpURLConnection) url.openConnection();
// send new POST

The second request is sent over the same TCP connection (verified with Wireshark), but I cannot understand why (although this is what I want), since I have called disconnect(). I checked the source code for HttpURLConnection, and the implementation does keep a keep-alive cache of connections to the same destinations. My problem is that I cannot see how the connection is placed back in the cache after I have sent the first request. disconnect() closes the connection, and without disconnect() I still cannot see how the connection is placed back in the cache. I saw that the cache has a run method that goes through all idle connections (I am not sure how it is called), but I cannot find where the connection is placed back in the cache. The only place where that seems to happen is the finished method of HttpClient, but this is not called for a POST with a response. Can anyone help me with this?

EDIT: My interest is in the proper handling of an HttpURLConnection object for TCP connection reuse. Should the input/output streams be closed, followed by a url.openConnection() each time to send the new request (avoiding disconnect())? If yes, I cannot see how the connection is being reused when I call url.openConnection() for the second time, since the connection has been removed from the cache for the first request and I cannot find how it is returned. Is it possible that the connection is not returned to the keep-alive cache (a bug?), but the OS has not released the TCP connection yet, and on a new connection the OS hands back the not-yet-released connection, or something similar?

EDIT2: The only related information I found was from [JDK Keep-Alive][1]:

...when the application calls close() on the InputStream returned by URLConnection.getInputStream(), the JDK's HTTP protocol handler will try to clean up the connection and if successful, put the connection into a connection cache for reuse by future HTTP requests.

But I am not sure which handler this is. As far as I could see, sun.net.www.protocol.http.Handler does not do any caching. Thanks!

[1]: http://download.oracle.com/javase/1.5.0/docs/guide/net/http-keepalive.html JDK_KeepAlive

+2  A: 

From the javadoc for HttpURLConnection (my emphasis):

Each HttpURLConnection instance is used to make a single request but the underlying network connection to the HTTP server may be transparently shared by other instances. Calling the close() methods on the InputStream or OutputStream of an HttpURLConnection after a request may free network resources associated with this instance but has no effect on any shared persistent connection. Calling the disconnect() method may close the underlying socket if a persistent connection is otherwise idle at that time.

Jim Garrison
I have read this. My question is how the connection is returned to the cache so that it can be shared by other instances. I cannot find the code that returns the connection to the cache.
I'm not sure what your concern is. Are you seeing behavior that leads you to suspect a bug in the implementation?
Jim Garrison
I started looking at the source code once I saw (using Wireshark) that the TCP connection was still reused even if I called disconnect(). In the source I could not see the connection being placed back in the keep-alive cache once I closed the input stream of the first request. So it made no difference whether I called disconnect() or not. This worried me that perhaps there is some kind of issue in the implementation, unless I am missing something.
+2  A: 

Should input/output stream be closed followed by a url.openConnection(); each time to send the new request (avoiding disconnect())?

Yes.

If yes, I can not see how the connection is being reused when I call url.openConnection() for the second time, since the connection has been removed from the cache for the first request and can not find how it is returned back.

You are confusing the HttpURLConnection with the underlying Socket and its underlying TCP connection. They aren't the same. The HttpURLConnection instances are GC'd, the underlying Socket is pooled, unless you call disconnect().

EJP
@EJP: I started looking at the source code once I saw (using Wireshark) that the TCP connection was still reused even if I called disconnect(). There is a concurrent hash map for caching URL connections, and I see that when a new HttpURLConnection is created, it uses a connection from the pool if one is available (in the HttpURLConnection constructor). But I cannot see how the connection is placed back in the keep-alive cache when I close the streams (after reading the whole response). I see that the connection is returned if the request is a HEAD (http.finished() is called), but not for a POST.
@EJP: You are right, the serverSockets are pooled in the ConcurrentHashMap, but once a request is about to be made, the socket is removed from the ClientVector, so that a new thread has to create a new connection. But this socket must return to the pool. This seems to happen only in putInKeepAliveCache(), but that is not called for a 200 OK with a response body. How is it possible to see the same behavior on the TCP connection whether I call disconnect() or not?
No, the *Sockets* are pooled. There are no ServerSockets in this context. And what you've found is that maybe not all of them are pooled.
EJP
A: 

I found that the connection is indeed cached when the InputStream is closed. Once the InputStream has been closed, the underlying connection is placed back in the keep-alive cache. The HttpURLConnection object is unusable for further requests, though, since the object is still considered "connected": its boolean field connected is set to true and is not cleared once the connection is placed back in the cache. So a new HttpURLConnection should be instantiated for each new POST, but the underlying TCP connection will be reused if it has not timed out. So EJP's answer was the correct description. Maybe the behavior I saw (reuse of the TCP connection despite explicitly calling disconnect()) was due to caching done by the OS? I do not know. I hope someone who knows can explain. Thanks.
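
For illustration, here is a minimal sketch of the pattern described above: a fresh HttpURLConnection per POST, the whole response read, the stream closed, and no call to disconnect(). The class name KeepAliveSketch, the local echo server, and the /echo path are assumptions made so the example runs on its own; they are not from the original code. It needs Java 9 or later for readAllBytes(). Whether the second request actually travels over the same TCP connection is up to the JDK's keep-alive cache and the server, so it is best checked with a tool such as Wireshark.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class KeepAliveSketch {

    // Hypothetical local echo server, only here so the sketch needs no external host.
    static HttpServer startEchoServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/echo", exchange -> {
            byte[] request = exchange.getRequestBody().readAllBytes();
            byte[] reply = ("echo:" + new String(request, StandardCharsets.UTF_8))
                    .getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, reply.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(reply);
            }
        });
        server.start();
        return server;
    }

    // One HttpURLConnection instance per request; the instance is never reused.
    static String post(URL url, String body) throws IOException {
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        try (OutputStream out = con.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        // Read the whole response and close the stream; disconnect() is
        // deliberately not called, so the socket stays eligible for reuse.
        try (InputStream in = con.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startEchoServer();
        try {
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/echo");
            System.out.println(post(url, "first"));
            System.out.println(post(url, "second"));
        } finally {
            server.stop(0);
        }
    }
}
```

The try-with-resources blocks ensure both streams are closed even if reading or writing fails, which is the step the keep-alive documentation ties to returning the connection for reuse.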

No, I don't think the OS can do this. There is a way to force closure if you really do NOT want any sharing of the underlying TCP connection: either force the use of HTTP 1.0 (which does not use persistent connections by default) or send a "Connection: close" header.
StaxMan
@StaxMan: I know about the Connection header in HTTP. My confusion was due to the fact that even though I explicitly closed the underlying socket, the connection was still reused. The behavior I was expecting was a delay of up to 2 seconds due to the 3-way handshake, and an HTTP connection over a new TCP connection. I did not see that, and I thought it was something related to OS caching.
Ok thanks for clarification.
StaxMan
A: 

Hmmh. I may be missing something here (since this is an old question), but as far as I know, there are 2 well-known ways to force closing of the underlying TCP connection:

  • Force use of HTTP 1.0 (1.1 made persistent connections the default); the version is indicated in the HTTP request line
  • Send a 'Connection' header with the value 'close'; this will force closing as well.
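
A minimal sketch of the second option, assuming the standard JDK HttpURLConnection. The class name ConnectionCloseSketch, the helper openNonPersistent, and the placeholder host are made up for the example. As far as I know, the keep-alive guide linked in the question also describes an http.keepAlive system property that can be set to false to switch persistent connections off for the whole JVM; treat that as something to verify for your JDK version.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class ConnectionCloseSketch {

    // Hypothetical helper: opens a connection that asks for the TCP connection
    // to be closed after this one exchange instead of being kept alive.
    static HttpURLConnection openNonPersistent(URL url) throws IOException {
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        // Must be set before the request is sent.
        con.setRequestProperty("Connection", "close");
        return con;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder host; openConnection() does not touch the network yet.
        URL url = new URL("http://example.invalid/");
        HttpURLConnection con = openNonPersistent(url);
        System.out.println(con.getRequestProperty("Connection"));
    }
}
```

The header has to be set on every HttpURLConnection instance that should not share a socket, since each instance carries its own request properties.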
StaxMan
@StaxMan: How do you "force use of HTTP 1.0" using the JDK's HttpURLConnection?
Hmmh. Good question -- I assumed there was a way, but I could not find one with quick googling. So I would use the second solution; it's better either way, and for this purpose works as well.
StaxMan