views:

394

answers:

1

Many linux/unix programming books and tutorials speak about the "Thundering Herd Problem" which happens when multiple threads or forks are blocked on a select() call waiting for readability of a listening socket. When the connection comes in, all threads and forks are woken up but only one "wins" with a successful call to "accept()". In the meantime, a lot of cpu time is wasted waking up all the threads/forks for no reason.

I noticed a project which provides a "fix" for this problem in the linux kernel, but this is a very old patch.

I think there are two variants; One where each fork does select() and then accept(), and one that just does accept().

Do modern unix/linux kernels still have the Thundering Herd Problem in both these cases or only the "select() then accept()" version?

+5  A: 

This is a very old problem, and for the most part does not exist any more. The Linux kernel (for the past few years) has had a number of changes with the way it handles and routes packets up the network stack, and includes many optimizations to ensure both low latency, and fairness (i.e., minimize starvation).

That said, the select system has a number of scalability issues simply by way of its API. When you have a large number of file descriptors, the cost of a select call is very high. This is primarily due to having to build, check, and maintain the FD sets that are passed to and from the system call.

Now days, the preferred way to do asynchronous IO is with epoll. The API is far simpler and scales very nicely across various types of load (many connections, lots of throughput, etc.)

0xfe
I'm thinking of the situation where there is only one socket per fork and the socket is blocking. So epoll() vs poll() vs select() doesn't matter here. Does linux have special task waking behaviour for listening sockets now?
jdkoftinoff
Yes, IIRC, the implementation now wakes up only one thread (as opposed to waking up all threads and having them compete.)How it picks the thread is complicated, but the simple answer is: it's priority based, so it relies on the scheduler to maintain fairness.Anyhow, that specific herding problem does not exist anymore so the situation isn't as dire upon overload.All this said, using select instead of epoll does incur a higher CPU cost (both in the kernel and in user-space), which is, IMO, negligible for your use-case.
0xfe
@0xfe - And what happens if that one thread does not do an `accept`?
Omnifarious