What is the behavior of the select(2) function when a file descriptor it is watching for reading is closed by another thread?

From some cursory testing, it does not return right away. I suspect the outcome is either that (a) it continues to wait for data, but if you actually tried to read from the descriptor you'd get EBADF (possibly; there's a potential race), or (b) it behaves as though the file descriptor had never been passed in. If the latter is true, passing in a single fd with no timeout would cause a deadlock if that fd were closed.

+1  A: 

I would expect that it would behave as if the end-of-file had been reached, that's to say, it would return with the file descriptor shown as ready but any attempt to read it subsequently would return "bad file descriptor".

Having said that, doing this is very bad practice anyway: there is always a potential race, because another file descriptor with the same number could be opened by yet another thread immediately after the second thread closed the original, and the selecting thread would then end up waiting on the wrong one.

As soon as you close a file, its number becomes available for reuse, and may get reused by the next call to open(), socket() etc, even if by another thread. Therefore you really, really need to avoid this kind of thing.
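The reuse described above can be seen in a few lines. This is only a minimal single-threaded sketch: fd_number_is_reused() is a made-up helper name, and dup() merely stands in for whatever open() or socket() call another thread might make. It relies on POSIX handing out the lowest available descriptor number.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the descriptor number freed by close()
 * is handed out again by the very next allocation, 0 if not, -1 on error. */
int fd_number_is_reused(void)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;

    int stale = p[0];        /* the number a select()ing thread might still hold */
    assert(stale >= 0);
    close(p[0]);             /* "another thread" closes it */

    int fresh = dup(p[1]);   /* stand-in for any later open(), socket(), ... */
    if (fresh < 0) {
        close(p[1]);
        return -1;
    }

    int reused = (fresh == stale);  /* same number, different open file */
    close(fresh);
    close(p[1]);
    return reused;
}
```

With no other thread allocating descriptors in between, the new descriptor gets the old number, which is exactly what makes a stale number in an fd_set dangerous.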

MarkR
I thought that it might return as ready too, but that's not quite right: the descriptor isn't actually in a ready state -- it's closed. And as you mention, by the time you go to use it it could be reassigned to something else.
Joe Shaw
You could avoid the race by using a mutex for the data structure containing the fd, though. But that would only work if the select() call had a timeout defined.
Joe Shaw
A: 

Isn't the point of select to avoid threads? In any case I think you need to play around with the options passed to select. Have a look at pselect, which can prevent race conditions in certain cases by setting signal masks.

Philluminati
pselect prevents race conditions with signals, not with file descriptors being closed by other threads.
bothie
A: 

It's a little confusing what you're asking...

select() should return upon an "interesting" change. If the close() merely decremented the reference count and the file was still open for writing somewhere, then there's no reason for select() to wake up.

If the other thread did close() on the only open descriptor then it gets more interesting, but I'd need to see a simple version of the code to see if something's really wrong.

dwc
A: 

The select system call is a way to wait for file descriptors to change state while the program doesn't have anything else to do. The main use is for server applications, which open a bunch of file descriptors and then wait for something to do on them (accept new connections, read requests or send responses). Those file descriptors will be opened in non-blocking I/O mode so that the server process never hangs in a syscall.

This also means there is no need for separate threads, because all the work that could be done in a thread can just as well be done before the select call. And if the work takes long, it can be interrupted: select is called with timeout={0,0}, the file descriptors get handled, and afterwards the work is resumed.
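A rough sketch of that zero-timeout pattern, assuming a single readable descriptor. The names readable_now() and work_with_polling() are invented for illustration, and the "work" is just a placeholder comment.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if fd is readable right now, 0 if not, -1 on error.
 * The {0,0} timeout makes select() return immediately. */
int readable_now(int fd)
{
    assert(fd >= 0 && fd < FD_SETSIZE);

    fd_set rfds;
    struct timeval tv = {0, 0};
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    int n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}

/* Hypothetical long job cut into slices; pending input on fd is drained
 * between slices. Returns the number of bytes read. */
long work_with_polling(int fd, int slices)
{
    long drained = 0;
    char buf[256];

    for (int i = 0; i < slices; i++) {
        /* ... one slice of the long-running work would go here ... */
        while (readable_now(fd) == 1) {
            ssize_t n = read(fd, buf, sizeof buf);
            if (n <= 0)
                break;          /* EOF or error: stop draining this round */
            drained += n;
        }
    }
    return drained;
}
```

Because the read only happens after select() reports the descriptor readable, the loop does not block even if the descriptor itself is in blocking mode.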

Now, you close a file descriptor in another thread. Why do you have that extra thread at all, and why should it close the file descriptor?

The POSIX standard gives no indication of what happens in this case, so what you're doing is UNDEFINED BEHAVIOR. Expect the result to differ considerably between operating systems, and even between versions of the same OS.

Regards, Bodo

bothie
I think it'll have undefined behaviour anyway, because it is impossible to remove the race condition of the file descriptor being closed *just before* the select and another one being opened with the same number.
MarkR
+1  A: 

From some additional investigation, it appears that both dwc and bothie are right.

bothie's answer to the question boils down to: it's undefined behavior. That doesn't necessarily mean that it's unpredictable, but that different OSes do it differently. It would appear that systems like Solaris and HP-UX return from select(2) in this case, but Linux does not, based on this post to the linux-kernel mailing list from 2001.

The argument on the linux-kernel mailing list is essentially that it is undefined (and broken) behavior to rely upon. In Linux's case, calling close(2) on the file descriptor effectively decrements a reference count on it. Since there is a select(2) call that also holds a reference to it, the fd will remain open and waiting for input until the select(2) returns. This is basically dwc's answer. You will get an event on the file descriptor and then it'll be closed. Trying to read from it will result in an EBADF, assuming the fd hasn't been recycled. (A concern that MarkR raised in his answer, although I think it's probably avoidable in most cases with proper synchronization.)
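One possible shape for that synchronization, along the lines of the mutex-plus-timeout idea from the comments above. This is only a sketch: struct watched_fd, watched_wait() and watched_close() are hypothetical names, and the design assumes every thread touches the descriptor only through these helpers.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical wrapper: the fd may only be used while holding the lock. */
struct watched_fd {
    pthread_mutex_t lock;
    int fd;                  /* set to -1 once closed */
};

/* Waits up to timeout_ms for input. Returns 1 if readable, 0 on timeout,
 * -1 if the fd was already closed or select() failed. The lock is held for
 * the whole call, so the fd cannot be closed underneath select(). */
int watched_wait(struct watched_fd *w, int timeout_ms)
{
    int result;

    pthread_mutex_lock(&w->lock);
    if (w->fd < 0) {
        result = -1;
    } else {
        assert(w->fd < FD_SETSIZE);
        fd_set rfds;
        struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };
        FD_ZERO(&rfds);
        FD_SET(w->fd, &rfds);
        int n = select(w->fd + 1, &rfds, NULL, NULL, &tv);
        result = (n < 0) ? -1 : (n > 0);
    }
    pthread_mutex_unlock(&w->lock);
    return result;
}

/* Closes the fd at most once; blocks until any in-progress wait times out. */
void watched_close(struct watched_fd *w)
{
    pthread_mutex_lock(&w->lock);
    if (w->fd >= 0) {
        close(w->fd);
        w->fd = -1;
    }
    pthread_mutex_unlock(&w->lock);
}
```

The timeout is what keeps this from deadlocking: the closing thread waits at most one timeout period for the lock. A waiter that loops tightly could still starve the closer, and any read() that follows a successful wait would need to happen under the same lock.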

So thank you all for the help.

Joe Shaw