I have an asynchronous application executing several threads doing operations over sockets where operations are scheduled and then executed asynchronously.

I'm trying to avoid the following situation: a read operation is scheduled on a socket, but before it starts executing, the socket is closed and its descriptor is reused for a new connection (possibly from another peer, accepted in another operation). The read then runs against a valid file descriptor but the wrong peer.

The problem arises because the sequence accept(); close(); accept() can return the same fd from both accept() calls, which leads to the situation above.
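A minimal sketch of the reuse behaviour described above. It uses socket() instead of accept() so that it needs no listening peer; both calls hand out the lowest-numbered free descriptor. The function name fd_is_reused is made up for illustration, and the result is only guaranteed when no other thread opens or closes descriptors in between.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a socket, close it, open another one, and report whether the
 * second call returned the same descriptor number as the first.
 * Returns 1 if the number was reused, 0 if not, -1 on error.
 * Assumes no other thread touches the fd table while this runs. */
int fd_is_reused(void)
{
    int first = socket(AF_UNIX, SOCK_STREAM, 0);
    if (first < 0)
        return -1;
    close(first);                 /* the number is free again */

    int second = socket(AF_UNIX, SOCK_STREAM, 0);
    if (second < 0)
        return -1;
    close(second);

    return first == second;
}
```

Any operation that stored only the integer from the first call cannot tell the two sockets apart.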

I can't see a way of avoiding it.

Any hints?

+2  A: 

How do you manage the sockets? It sounds like you have multiple threads any of which can:

  1. accept an incoming connection
  2. close an existing connection
  3. make a new outgoing connection

It sounds like you need a way to mediate access to the various sockets floating around. Have you considered associating each socket with a mutex that prevents closing the socket while it's still in use, or putting each socket descriptor in a struct with an atomic reference count that prevents other threads from closing it until all threads are done using it?
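A minimal sketch of the reference-counted variant, assuming C11 atomics are available. The struct and function names (shared_sock, shared_sock_acquire, shared_sock_release) are hypothetical, and the sketch assumes the struct itself outlives every thread that holds a reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <unistd.h>

/* Hypothetical wrapper; not part of any library. */
struct shared_sock {
    int fd;
    atomic_int refs;   /* number of holders, including the owner */
};

/* The creator starts out holding one reference. */
void shared_sock_init(struct shared_sock *s, int fd)
{
    s->fd = fd;
    atomic_init(&s->refs, 1);
}

/* Take a reference before scheduling an operation.
 * Returns 1 on success, 0 if the count already dropped to zero. */
int shared_sock_acquire(struct shared_sock *s)
{
    int cur = atomic_load(&s->refs);
    while (cur > 0) {
        /* On failure, cur is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&s->refs, &cur, cur + 1))
            return 1;
    }
    return 0;
}

/* Drop a reference. The holder that drops the last one closes the fd.
 * Returns 1 if this call closed the socket, 0 otherwise. */
int shared_sock_release(struct shared_sock *s)
{
    if (atomic_fetch_sub(&s->refs, 1) == 1) {
        close(s->fd);
        s->fd = -1;
        return 1;
    }
    return 0;
}
```

Code that would have called close() directly calls shared_sock_release() instead, so the descriptor number cannot be handed out again while an operation still holds a reference.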

Robert S. Barnes
I could do it that way, but adding mutexes to this almost lock-free system is something I'm trying to avoid. The best thing would be a guarantee that accept() won't hand out fds sequentially, but I can't see that happening.
Arkaitz Jimenez
@Arkaitz - then the second solution I mentioned might be good for you. Keep the socket in a struct with a reference counter and just don't let any threads close a socket with a reference count above zero. No lock, and the socket doesn't get closed until all the threads are done with it.
Robert S. Barnes
A: 

A socket is identified by a 5-tuple {local-addr, local-port, remote-addr, remote-port, proto}, so if you can use these properties instead of the fd for event/handler routing, you can avoid the fd clash.

Another option would be to serialize all close()/accept() operations (with priorities?) so that they cannot interleave.
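A rough sketch of routing by connection identity rather than by fd number, assuming IPv4 TCP sockets (so the protocol field is implicit). The names conn_key, conn_key_from_fd and conn_key_equal are invented for this example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical connection identity: both endpoints of an IPv4 socket. */
struct conn_key {
    struct sockaddr_in local;
    struct sockaddr_in remote;
};

/* Fill key from a connected socket. Returns 0 on success, -1 on error
 * or if the socket is not an IPv4 socket. */
int conn_key_from_fd(int fd, struct conn_key *key)
{
    socklen_t len = sizeof key->local;

    memset(key, 0, sizeof *key);
    if (getsockname(fd, (struct sockaddr *)&key->local, &len) < 0)
        return -1;
    len = sizeof key->remote;
    if (getpeername(fd, (struct sockaddr *)&key->remote, &len) < 0)
        return -1;
    if (key->local.sin_family != AF_INET ||
        key->remote.sin_family != AF_INET)
        return -1;
    return 0;
}

/* Two keys match only if both endpoints are identical. */
int conn_key_equal(const struct conn_key *a, const struct conn_key *b)
{
    return a->local.sin_addr.s_addr == b->local.sin_addr.s_addr
        && a->local.sin_port == b->local.sin_port
        && a->remote.sin_addr.s_addr == b->remote.sin_addr.s_addr
        && a->remote.sin_port == b->remote.sin_port;
}
```

A scheduled operation could store the key captured at scheduling time and compare it with a freshly derived key just before it runs. This narrows the window for hitting the wrong peer but does not remove it, since the fd can still be recycled between the check and the read.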

catwalk
I'd still need mutexes to protect those structs, which will slow the system more, being an edge situation I'm trying to avoid locking there.
Arkaitz Jimenez
A: 

Keep a count of the pending operations (read/write, etc.) for each socket and also whether there is a pending close request on the socket. Where before you would have called close, check first whether there are any pending operations. If there are, call shutdown instead, and then only call close when the pending operations count reaches 0.
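One possible shape for this, sketched with C11 atomics. All names (tracked_sock, tracked_op_begin, tracked_op_end, tracked_request_close) are hypothetical, and the extra closed flag is an addition of this sketch so that close() runs at most once when a close request races with the last operation finishing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <unistd.h>

struct tracked_sock {
    int fd;
    atomic_int pending;           /* operations scheduled, not finished */
    atomic_bool close_requested;  /* someone asked for the socket to go */
    atomic_flag closed;           /* set by whoever actually calls close */
};

void tracked_init(struct tracked_sock *s, int fd)
{
    s->fd = fd;
    atomic_init(&s->pending, 0);
    atomic_init(&s->close_requested, false);
    atomic_flag_clear(&s->closed);
}

/* Close exactly once. Returns 1 if this call did the close. */
static int close_once(struct tracked_sock *s)
{
    if (atomic_flag_test_and_set(&s->closed))
        return 0;
    close(s->fd);
    return 1;
}

/* Call when an operation finishes. Returns 1 if it closed the socket. */
int tracked_op_end(struct tracked_sock *s)
{
    if (atomic_fetch_sub(&s->pending, 1) == 1 &&
        atomic_load(&s->close_requested))
        return close_once(s);
    return 0;
}

/* Call before scheduling an operation. Returns 0 if a close is pending,
 * in which case the operation must not be scheduled. */
int tracked_op_begin(struct tracked_sock *s)
{
    atomic_fetch_add(&s->pending, 1);
    if (atomic_load(&s->close_requested)) {
        tracked_op_end(s);        /* undo our own increment */
        return 0;
    }
    return 1;
}

/* Use where close() was called before. Returns 1 if closed immediately,
 * 0 if the close was deferred to the last pending operation. */
int tracked_request_close(struct tracked_sock *s)
{
    atomic_store(&s->close_requested, true);
    if (atomic_load(&s->pending) == 0)
        return close_once(s);
    shutdown(s->fd, SHUT_RDWR);   /* pending I/O fails, fd stays allocated */
    return 0;
}
```

Because shutdown() leaves the descriptor allocated, accept() cannot return the same number until the last pending operation has called tracked_op_end().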

atomice
+1  A: 

Great question! I didn't even realize that such a problem could occur.

The only answer that I can think of is that you mustn't use close() to signal that a socket is terminated. One solution is to use shutdown() to terminate the connection. You could then close() the socket safely by employing reference counting.

TrayMan
The `close` function decrements the socket's ref count in the underlying OS; `shutdown` forces a close and sends an EOF / FIN to the peer. See this SO post for more details: http://stackoverflow.com/questions/409783/socket-shutdown-vs-socket-close/598759#598759
Robert S. Barnes
As mentioned in the link you posted, shutdown() does not close the socket; it merely terminates communication. shutdown() would cause any pending (or future) I/O operation to fail, but would not deallocate the FD, so the described problem would not occur. When an operation fails, the thread doing the operation would decrement a reference count (which had previously been incremented when the thread received the socket), and if that reaches zero it closes the socket. Obviously this refcount would be something implemented in the program, not the OS refcount.
TrayMan
You don't even need to worry about a reference count, just close the socket if the read operation fails.
atomice
@TrayMan - The OP said that part of the problem is that pending operations need to complete before the connection is torn down. Your solution is incomplete because it doesn't address this part of the problem.
Robert S. Barnes
That's not how I read it. But to do that, you just need to drop the shutdown part from my solution. The last thread operating on the socket will then close it.
TrayMan
+1  A: 

Ok, found the answer.

The best way here is to call accept(), which returns the lowest fd available, duplicate it onto a number of your own choosing, e.g. dup2(6, 1000), and then close(6). You now control the fd range you use.

The next accept() will return 6 again, or something similar, and we'll dup2(6, 999), and keep decreasing like that, resetting the target when it gets too low.

Since accepting is always done in the same thread, and dup2() and close() aren't expensive compared to accept(), it's perfect for my needs.
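A small sketch of the relocation step described above, under the assumption that only the accepting thread calls it. The helper names (relocate_fd, fd_range, fd_range_next) are illustrative, and the target range must stay below the process's open-file limit.

```c
#include <assert.h>
#include <unistd.h>

/* Move fd onto the descriptor number target and free the original.
 * Returns target on success, -1 on error (fd stays open in that case).
 * Caveat: dup2() silently closes target first if it is already open. */
int relocate_fd(int fd, int target)
{
    if (fd == target)
        return target;
    if (dup2(fd, target) < 0)
        return -1;
    close(fd);
    return target;
}

/* Hypothetical target allocator: counts down from hi and wraps at lo.
 * Not thread-safe; meant to be used from the accepting thread only. */
struct fd_range {
    int hi;
    int lo;
    int next;
};

int fd_range_next(struct fd_range *r)
{
    int target = r->next;
    r->next = (r->next > r->lo) ? r->next - 1 : r->hi;
    return target;
}
```

After each accept(), the accepting thread would call relocate_fd(fd, fd_range_next(&range)). Once the range wraps, a target may still belong to a live connection, and dup2() would close it, so the range needs to be larger than the number of sockets that can be open at once.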

Arkaitz Jimenez
+1  A: 

I would still be careful of using dup2() to a well-known fd value. Remember that dup2() will perform a close on the target before duping; that could conflict with some unrelated thread doing unrelated I/O if you start to have 1000 files open.

If I were you, given the constraints you're insisting upon, I would use dup() (not dup2()) inside of a mutex. (Maybe per-fd mutexes if you're that concerned about it.)

asveikau
Arkaitz Jimenez
But can you prove that you're not doing unrelated file opens? Or maybe some library is doing them on your behalf? And can you prove that the count of naturally occurring FDs won't eventually reach 1000? Maybe you can guarantee these things, but this sort of thing would make me nervous with your approach.
asveikau