Hi, I've been doing a lot of reading on blocking vs. non-blocking sockets for UDP, but I'm having a hard time understanding the merits of one over the other. The overwhelming majority of comments on the internet seem to indicate that non-blocking is better, but they aren't very specific about which scenarios it is better in, and I've found no references so far on when blocking is preferred. My hope with this question is that the community may be able to shine a little light on the subject.

A little background on my own problem set, so that answers can be applied specifically as well as to the general nature of the question. I have a UDP server that will have 40 connections on a local LAN, with a constant stream of data flowing in on each. Data rates will be around 250MB/s average, with peaks to 500+Mb/s, and the average datagram size is around 1400 bytes. Processing of the datagrams is light, but because of the large volume of messages, efficiency and performance are a high priority in order to prevent dropped packets.

As I've not been able to find much contextual information for something resembling this particular problem set, I've had to make a few guesses based on what I've been able to glean about blocking vs. non-blocking. I'll end with my current hypothesis and then open it up to your input. Basically, since this will be an almost constant stream of packets on every connection, I'm thinking a blocking socket would be preferable, because the time any recv call actually spends blocked would be very minimal, versus an event-based model which would generate an overwhelming number of triggers in asynchronous mode. I feel my real problem will most likely be priority management for the 40 threads I'm planning to use to read from the sockets, i.e. making sure each gets its share of CPU time. I may be incorrect in my approach and ideas, so I would be very appreciative if the community could help shine some light on the matter.
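
To make that concrete, here is roughly the per-socket blocking loop I have in mind. This is only a sketch; process_datagram() and the rest of the thread plumbing are placeholders I made up for illustration:

    /* Rough, untested sketch: one thread per port, blocking recvfrom() in a
       loop.  process_datagram() is a placeholder for the light processing. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    void process_datagram(const char *buf, size_t len);   /* placeholder */

    static void *reader_thread(void *arg)
    {
        int port = *(int *)arg;                 /* one of the 40 ports */

        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(port);
        bind(fd, (struct sockaddr *)&addr, sizeof(addr));

        char buf[2048];                         /* datagrams are ~1400 bytes */
        for (;;) {
            /* blocks only until the next datagram shows up, which should be
               almost immediately given the constant stream */
            ssize_t n = recvfrom(fd, buf, sizeof(buf), 0, NULL, NULL);
            if (n > 0)
                process_datagram(buf, (size_t)n);
        }
        return NULL;                            /* never reached */
    }
    /* main() would pthread_create() 40 of these, one per port */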

Regards, Jim.

~edit~

While I am concerned with how the threading design will influence and integrate with the blocking/non-blocking question, I'm really mostly concerned with how blocking vs. non-blocking should be viewed from the perspective of my problem set. If threading does indeed become an issue, I can go with a thread-pool solution.

~edit2~

First, I wanted to say thank you for the responses so far. A few of you have mentioned that the one-thread-per-socket model with this many sockets may be a bad idea, and I admit I was tentative about that solution myself. However, in one of the links in Nikolai's response, the author discusses a one-thread-per-socket model and links to a very interesting paper that I thought I would link to here, as it dispels a lot of myths I held about threads vs. event-based models: Why Events Are a Bad Idea

Enjoy.

A: 

I'm not so sure using 40 threads for 40 sockets is a great idea... Sure, using a thread per socket makes sense when you have a small number of sockets; however, having that many threads is just asking for thread starvation, deadlock, and missed packets.

As for blocking vs. non-blocking, remember that blocking can be relatively costly, though in my opinion it is easier to work with. Async triggers etc. are probably faster overall than having to block and wake a thread.

Polaris878
While deadlocks won't be an issue, since each connection will be independently processed, missed packets and thread starvation could be a problem. I'm more concerned with WHY exactly blocking is more costly, especially as it applies to this example. If the calls rarely block, I'm not sure how that's better than a flurry of event notices.
jim
+3  A: 

Not an answer, just some links if you don't yet have them in your bookmarks:

The C10K problem by Dan Kegel,
High-Performance Server Architecture by Jeff Darcy,
Advanced poll APIs: epoll(4), kqueue(2).

Edit:

As dumb as it may sound, I totally missed that you are working with UDP, so...

Since there are no protocol-level connections in UDP, and unless you have to work on different ports, you don't need 40 sockets on the server. Just one UDP "server" socket will do for all the clients. You can block on this one socket all you like; just make sure the socket receive buffer is large enough to accommodate traffic spikes, and don't spend too much time processing each read.
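
Something along these lines, as a rough untested sketch (the port number, buffer size, and handle_datagram() are arbitrary placeholders):

    /* Sketch of the single-socket idea: one blocking UDP socket, receive
       buffer bumped up so traffic spikes don't overflow it. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    void handle_datagram(const char *buf, size_t len,
                         const struct sockaddr_in *from);  /* placeholder */

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        /* ask the kernel for a few MB of receive buffer so bursts are
           absorbed while we're still busy with a previous datagram */
        int rcvbuf = 4 * 1024 * 1024;
        setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf));

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(5000);            /* placeholder port */
        bind(fd, (struct sockaddr *)&addr, sizeof(addr));

        char buf[2048];
        struct sockaddr_in peer;
        socklen_t peerlen;
        for (;;) {
            peerlen = sizeof(peer);
            /* blocking read; the peer address tells you which client sent it */
            ssize_t n = recvfrom(fd, buf, sizeof(buf), 0,
                                 (struct sockaddr *)&peer, &peerlen);
            if (n > 0)
                handle_datagram(buf, (size_t)n, &peer);  /* keep this cheap */
        }
    }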

Nikolai N Fetissov
Hi, thanks for the reply Nikolai, and thank you for the links; I'll look at them now. The 40 different connections I mentioned are one IP and 40 ports. So yes, unfortunately, I'm working on different ports, which requires 40 different sockets.
jim
+1  A: 

I don't know that blocking or non-blocking has a significant performance advantage; it's more a question of what sort of things your network I/O event loops want to do:

  • If the only thing your network I/O thread is ever going to do is listen for incoming UDP packets on a single socket, then blocking I/O will probably work fine and will be easier to program.

  • If your network I/O thread needs to handle more than one socket, then blocking I/O becomes problematic, because if it is blocking on socket A, it won't get woken up to handle data arriving on socket B, or vice versa. In this case, non-blocking I/O becomes preferred, since you can do your blocking in select() or poll(), which will return whenever data is available on any of the watched sockets.

Note that even in the non-blocking case you wouldn't want to busy-loop between packets, since burning CPU cycles in thread A means they won't be available to thread B, which would hurt performance. So if you aren't blocking in recv(), be sure to block in select() or poll() instead.
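
For example, the "block in poll(), not in recv()" version might look roughly like this (a sketch only; it assumes the sockets in fds[] are already created, bound, and set to non-blocking mode, and handle_packet() is a placeholder):

    /* Sketch: sleep in poll() across many UDP sockets instead of spinning.
       Assumes fds[] holds bound, non-blocking descriptors (at most 64 here). */
    #include <poll.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    void handle_packet(int which, const char *buf, size_t len);  /* placeholder */

    void io_loop(const int *fds, int count)
    {
        struct pollfd pfds[64];
        for (int i = 0; i < count; i++) {
            pfds[i].fd = fds[i];
            pfds[i].events = POLLIN;
        }

        char buf[2048];
        for (;;) {
            /* sleeps here until at least one socket has data; no CPU burned */
            int ready = poll(pfds, count, -1);
            if (ready <= 0)
                continue;

            for (int i = 0; i < count; i++) {
                if (pfds[i].revents & POLLIN) {
                    /* drain the socket; with O_NONBLOCK set, recv() returns -1
                       (EAGAIN/EWOULDBLOCK) once its queue is empty */
                    ssize_t n;
                    while ((n = recv(pfds[i].fd, buf, sizeof(buf), 0)) > 0)
                        handle_packet(i, buf, (size_t)n);
                }
            }
        }
    }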

Jeremy Friesner
Also, if you want to minimize dropped incoming UDP packets, I'd recommend the following: (1) Do as little as possible in your UDP-receiver threads; ideally just receive the packets, add them to a queue for processing by another thread, and get back to select()/recv() ASAP. (2) Run the receiver threads at a higher priority than the processing thread(s). (3) Use setsockopt(fd, SOL_SOCKET, SO_RCVBUF, ...) to make the incoming data buffer for your UDP sockets as large as possible.
Jeremy Friesner
You should rephrase your answer. As it is, I'm itching to downvote, because select or poll works perfectly well with blocking I/O (that's exactly what they are made for: block on several handles at once and unblock when any of them has data).
kriss
+1  A: 
  • When using blocking IO, at some point in your program you should have a poll or a select waiting for data on any of your file handles (in your case, sockets), because if you read from any of them without first ensuring data is ready, the read will block and the program will stop servicing the other sockets. To avoid this and keep things simple, programs using blocking IO are often written with one thread per socket/file handle, thus avoiding the need for poll or select.

  • If you use non-blocking IO, your program just runs and checks for data arrival on each read, so there is no need for poll or select (see the sketch after this list). The program can still be quite simple, and there is no need to use a thread per socket for this purpose either.
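
For illustration, the non-blocking read by itself could look roughly like this (a minimal sketch; error handling is omitted and try_read() is a made-up name):

    /* Minimal sketch of the non-blocking style: mark the socket O_NONBLOCK,
       then just try the read; if nothing is pending, recv() returns -1 with
       errno set to EAGAIN/EWOULDBLOCK and the program simply moves on. */
    #include <errno.h>
    #include <fcntl.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    void try_read(int fd)
    {
        /* normally done once, right after socket(); shown inline for brevity */
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

        char buf[2048];
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n > 0) {
            /* data was waiting: process it */
        } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* nothing there right now: do other work and try again later */
        }
    }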

I believe the most efficient approach is to use poll or select to manage several IO channels at once (they can be a subset of all the file handles, split between threads, if you prefer). It is more efficient than non-blocking IO without poll or select, because that method basically tries to read on every socket most of the time, uselessly, and that has a cost. The worst of these three methods is blocking IO with one thread per file handle, because of the high cost of thread management compared to a read returning EWOULDBLOCK or to a poll.

That said, non-blocking IO has another advantage: your program may have computations to do besides IO, and while it is blocked waiting for IO it can't do them. This can lead you to use poll/select with non-blocking IO, or to use poll/select with a small timeout, or even to dedicate a small specialized thread to IO and use other threads for the more computationally intensive parts of the program.

In some cases you may also not have any choice. I once had to wait for data from a file handle mounted through NFS; in such a case, trying to set non-blocking IO is useless, because the NFS layer uses blocking IO internally...

You may also consider using asynchronous IO. It's very efficient, and your program becomes 'event driven'. It's quite usual on Windows systems; I haven't looked at the current development state of asynchronous IO for Linux. Last time I checked, some people were working on adding asynchronous IO to the kernel API, but I don't know whether it's stable or has reached mainstream kernels.

kriss