I am re-writing the core NIO server networking code for my project, and I'm trying to figure out when I should "store" connection information for future use. For example, once a client connects in the usual manner, I store and associate the SocketChannel object for that connected client so that I can write data to that client at any time. Generally I use the client's IP address (including port) as the key in a HashMap that maps to the SocketChannel object. That way, I can easily do a lookup on their IP address and asynchronously send data to them via that SocketChannel.
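
Roughly, the bookkeeping looks like this (a minimal sketch of the approach described above; the class and method names are placeholders, not my actual code):

```java
import java.net.SocketAddress;
import java.nio.channels.SocketChannel;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Maps each client's remote address (IP + port) to its channel, so data
// can be written to that client at any time.
public class ClientRegistry {
    private final Map<SocketAddress, SocketChannel> clients = new ConcurrentHashMap<>();

    public void add(SocketChannel channel) {
        clients.put(channel.socket().getRemoteSocketAddress(), channel);
    }

    public SocketChannel lookup(SocketAddress remoteAddress) {
        return clients.get(remoteAddress);
    }

    public void remove(SocketChannel channel) {
        clients.remove(channel.socket().getRemoteSocketAddress());
    }
}
```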

This might not be the best approach, but it works, and the project is too large to change its fundamental networking code, though I would consider suggestions. My main question, however, is this:

At what point should I "store" the SocketChannel for future use? I have been storing a reference to the SocketChannel once the connection is accepted (via OP_ACCEPT). I feel that this is an efficient approach, because I can then assume the map entry already exists when an OP_READ event comes in; otherwise, I would have to check the HashMap (and possibly insert into it) on every OP_READ, and it is obvious that MANY more of those will occur for a client than OP_ACCEPT events. My fear, I guess, is that some connections may be accepted (OP_ACCEPT) but never send any data (OP_READ), perhaps due to a firewall issue or a malfunctioning client or network adaptor. I think this could lead to "zombie" connections that are not active but also never deliver a close message.
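
To make the accept path concrete, here is roughly what I mean (a sketch; the lastActivity map is just one idea for reaping "zombie" connections, not something I have in place):

```java
import java.io.IOException;
import java.net.SocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Map;

// Called when the selector reports OP_ACCEPT on the server channel.
void handleAccept(SelectionKey key, Selector selector,
                  Map<SocketAddress, SocketChannel> clients,
                  Map<SocketChannel, Long> lastActivity) throws IOException {
    ServerSocketChannel server = (ServerSocketChannel) key.channel();
    SocketChannel client = server.accept();
    if (client == null) {
        return; // nothing was actually pending
    }
    client.configureBlocking(false);
    client.register(selector, SelectionKey.OP_READ);
    // Store the channel now, keyed by remote address, so the OP_READ
    // handler can assume the entry already exists.
    clients.put(client.socket().getRemoteSocketAddress(), client);
    // A timestamp like this would let a periodic task close connections
    // that were accepted but never became readable.
    lastActivity.put(client, System.currentTimeMillis());
}
```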

Part of my reason for re-writing my network code is that on rare occasions, I get a client connection that has gotten into a strange state. I'm thinking the way I've handled OP_ACCEPT versus OP_READ, including the information I use to assume a connection is "valid" and can be stored, could be wrong.

I'm sorry my question isn't more specific; I'm just looking for the best, most efficient way to determine if a SocketChannel is truly valid so I can store a reference to it. Thanks very much for any help!

+1  A: 

If you're using Selectors and non-blocking IO, then you might want to consider letting NIO itself keep track of the association between a channel and its stateful data. When you call SelectionKey.register(), you can use the three-argument form to pass in an "attachment". At every point in the future, that SelectionKey will always return the attachment object that you provided. (This is pretty clearly inspired by the "void *user_data" type of argument in OS-level APIs.)

That attachment stays with the key, so it's a convenient place to keep state data. The nice thing is that all the mapping from channel to key to attachment will already be handled by NIO, so you do less bookkeeping. Bookkeeping--like Map lookups--can really hurt inside of an IO responder loop.

As an added feature, you can also change the attachment later, so if you needed different state objects for different phases of your protocol, you can keep track of that on the SelectionKey, too.
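
For instance (a minimal sketch; ConnectionState and HandshakeState are placeholders for whatever per-connection objects you keep):

```java
// Attach per-connection state when registering the channel...
SelectionKey key = channel.register(selector, SelectionKey.OP_READ, new ConnectionState());

// ...recover it later in the responder loop, with no Map lookup at all...
ConnectionState state = (ConnectionState) key.attachment();

// ...and swap it out when the protocol moves to a different phase.
key.attach(new HandshakeState());
```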

Regarding the odd state you find your connections in, there are some subtleties in using NIO and selectors that might be biting you. For example, once a SelectionKey signals that it's ready for read, it will continue to be ready for read the next time some other thread calls select(). So, it's easy to end up with multiple threads attempting to read the socket. On the other hand, if you attempt to deregister the key for reading while you're doing the read, then you can end up with threading bugs because SelectionKeys and their interest ops can only be manipulated by the thread that actually calls select(). So, overall, this API has some sharp edges, and it's tricky to get all the state handling correct.
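
One common shape for the single-selector-thread workaround looks like this (a sketch; executor, pendingReads, and readAndProcess are placeholder names):

```java
// Selector thread: turn OP_READ off before handing the key to a worker,
// so the next select() cannot report the same socket as readable again.
void dispatchRead(final SelectionKey key) {
    key.interestOps(key.interestOps() & ~SelectionKey.OP_READ);
    executor.submit(new Runnable() {
        public void run() {
            readAndProcess(key);     // the actual read happens off-thread
            pendingReads.add(key);   // queue the key for re-arming
            key.selector().wakeup(); // kick the selector out of select()
        }
    });
}
```

The selector thread then drains pendingReads after each select() call and adds OP_READ back to each key's interest ops, so all key manipulation stays on the one thread that selects.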

Oh, and one more possibility: depending on who closes the socket first, you may or may not notice a closed socket until you explicitly ask. I can't recall the exact details off the top of my head, but it's something like this: the client half-closes its end of the socket, that does not signal any ready op on the selection key, and so the SocketChannel never gets read. This can leave a bunch of sockets in TIME_WAIT status on the client.
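
One concrete check that helps here (a sketch): a read() that returns -1 means the peer has closed its end of the connection, and reading is often the only way to find that out:

```java
ByteBuffer buffer = ByteBuffer.allocate(4096);
int bytesRead = channel.read(buffer);
if (bytesRead == -1) {
    // End-of-stream: the peer closed its side. Clean up explicitly,
    // or the connection lingers in your bookkeeping forever.
    key.cancel();
    channel.close();
}
```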

As a final recommendation, if you're doing async IO, then I definitely recommend a couple of books in the "Pattern Oriented Software Architecture" (POSA) series. Volume 2 deals with a lot of IO patterns. (For instance, NIO lends itself very well to the Reactor pattern from Volume 2. It addresses a bunch of those state handling problems I mention above.) Volume 4 includes those patterns and embeds them in the larger context of distributed systems in general. Both of these books are very valuable resources.
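
To give a flavor of it, the core of a Reactor-style NIO loop looks roughly like this (a bare-bones sketch, not code from the books; handleAccept and handleRead are placeholder dispatch methods):

```java
Selector selector = Selector.open();
serverChannel.configureBlocking(false);
serverChannel.register(selector, SelectionKey.OP_ACCEPT);

while (running) {
    selector.select(); // block until at least one channel is ready
    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
    while (it.hasNext()) {
        SelectionKey key = it.next();
        it.remove(); // must remove, or the key is reported again next time
        if (key.isAcceptable()) {
            handleAccept(key); // register the new client for OP_READ
        } else if (key.isReadable()) {
            handleRead(key);   // dispatch to the protocol handler
        }
    }
}
```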

mtnygard
Wow, thanks for the large quantity (and quality) of help! I am researching this stuff as best I can now. 1) I think you mean SocketChannel.register(), not SelectionKey.register(), right? 2) It seems I can avoid some of the threading issues by only using one thread for my NIO, right? 3) Does it seem reasonable to periodically check sockets for the TIME_WAIT status for candidates that should be closed? (the half-closed issue you mentioned) Thanks a ton!
ZenBlender
1) Yes, register() is on SelectableChannel (which SocketChannel extends). That was what I meant. 2) You can use just one thread for the selection, then hand the channel off for read/write actions. If you're dealing with many concurrent connections, you'll find that one thread by itself can't keep up. It's the handoff that is tricky. 3) Periodically looking for half-closed sockets has worked for me.
mtnygard
A: 

An alternative may be to look at an existing NIO socket framework. Possible candidates are:

- Netty
- Apache MINA
- Grizzly

Michael Barker