views:

461

answers:

4

I do not understand what the difference is between calling recv() on a non-blocking socket vs a blocking socket after waiting to call recv() after select returns that it is ready for reading. It would seem to me like a blocking socket will never block in this situation anyway.
Also, I have heard that one model for using non blocking sockets is try to make calls (recv/send/etc) on them after some amount of time has passed instead of using something like select. This technique seems slow and wasteful to be compared to using something like select (but then I don't get the purpose of non-blocking at all as described above). Is this common in networking programming today?

+1  A: 

The select, poll, epoll, kqueue, etc. facilities target multiple socket/file descriptor handling scenarios. Imagine a heavy loaded web-server with hundreds of simultaneously connected sockets. How would you know when to read and from what socket without blocking everything?

Nikolai N Fetissov
A: 

If you call read on a non-blocking socket, it will return immediately if no data has been received since the last call to read. If you only had read, and you wanted to wait until there was data available, you would have to busy wait. This wastes CPU.

poll and select (and friends) allow you to sleep until there's data to read (or write, or a signal has been received, etc.).

If the only thing you're doing is sending and receiving on that socket, you might as well just use a non-blocking socket. Being asynchronous is important when you have other things to do in the meantime, such as update a GUI or handle other sockets.

daf
A: 

For your first question, there's no difference in that scenario. The only difference is what they do when there is nothing to be read. Since you're checking that before calling recv() you'll see no difference.

For the second question, the way I see it done in all the libraries is to use select, poll, epoll, kqueue for testing if data is available. The select method is the oldest, and least desirable from a performance standpoint (particularly for managing large numbers of connections).

Jay
It's possible for `select` to indicate a socket is readable, but for it to then have no data available when you go to read it. It's very much an edge case, but it can happen in situations like a packet with a bad checksum was recieved (which is only checked when the application goes to read the data).
caf
+2  A: 

There's a great overview of all of the different options for doing high-volume I/O called The C10K Problem. It has a fairly complete survey of a lot of the different options, at least as of 2006.

Quoting from it, on the topic of using select on non-blocking sockets:

Note: it's particularly important to remember that readiness notification from the kernel is only a hint; the file descriptor might not be ready anymore when you try to read from it. That's why it's important to use nonblocking mode when using readiness notification.

And yes, you could use non-blocking sockets and then have a loop that waits if nothing is ready, but that is fairly wasteful compared to using something like select or one of the more modern replacements (epoll, kqueue, etc). I can't think of a reason why anyone would actually want to do this; all of the select like options have the ability to set a timeout, so you can be woken up after a certain amount of time to perform some regular action. I suppose if you were doing something fairly CPU intensive, like running a video game, you may want to never sleep but instead keep computing, while periodically checking for I/O using non-blocking sockets.

Brian Campbell
Such busy loops are fairly common when low latency is desired and wasting cycles is covered by some $$$ (read - financial industry :). The idea is, basically, not to let the kernel sleep inside the syscall thus gapping for at least two context switches.
Nikolai N Fetissov
Fair enough. So there are some reasons why you might do it, but for most applications, you generally want to use a `select` like loop.
Brian Campbell