In most applications, the listener itself is unlikely to be the bottleneck. More often, you need to respond to those TCP requests by doing some kind of work, and the number of workers your application can effectively use at one time will be smaller than the number of connections it could theoretically sustain.
This probably explains Alex Martelli's "rule of thumb" in the comments:
"I always tend to use 5, but I have no sharply reasoned explanation for why."
On a quad-core server this makes sense if clients kick off CPU-intensive tasks: 4 connections for 4 workers/threads/cores, and one more to keep the service responsive when it's heavily loaded.
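As a rough illustration of the quad-core reasoning above, here is a minimal sketch in Python. It assumes the number under discussion is the backlog passed to `listen()`, which caps how many not-yet-accepted connections the OS will queue, and that the work runs in a pool of 4 threads. The names `WORKERS`, `BACKLOG`, `handle`, `serve`, `make_listener` and `demo` are made up for this example, and the upper-casing handler is just a stand-in for real work.

```python
import socket
import threading
from concurrent.futures import ThreadPoolExecutor

WORKERS = 4            # assumption: one worker per core on a quad-core box
BACKLOG = WORKERS + 1  # the "5" from the rule of thumb


def make_listener(host="127.0.0.1", port=0, backlog=BACKLOG):
    """Create a TCP listener; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(backlog)  # pending connections beyond this may be refused
    return sock


def handle(conn):
    """Hypothetical handler: read one message and echo it upper-cased."""
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def serve(listener, pool, max_requests):
    """Accept max_requests connections and hand each one to the pool."""
    for _ in range(max_requests):
        conn, _addr = listener.accept()
        pool.submit(handle, conn)


def demo():
    """Run the server for a single request and return the client's reply."""
    listener = make_listener()
    port = listener.getsockname()[1]
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        server = threading.Thread(target=serve, args=(listener, pool, 1))
        server.start()
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(b"hi")
            reply = client.recv(1024)
        server.join()
    listener.close()
    return reply
```

The accept loop itself does almost nothing, so the pool size is what really limits throughput here. The backlog only decides how many clients can wait in line while all four workers are busy, and the OS may clamp or round whatever value you pass.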
If your work is not CPU-intensive, however, that reasoning doesn't apply to you. You might be able to manage 10 or 50 connections, or you might depend on some machine-exclusive resource and only be able to allow one. I basically agree with Hans that there's no hard and fast "right answer" for this: look at what your application does, estimate how many connections it can actually handle, and then tune that number by testing.