We're building a test harness to push binary messages out on a UDP multicast.

The prototype uses the Twisted reactor loop to push out messages, and it just about achieves the level of traffic we require: roughly 120,000 messages per second.

We have 16 cores on our test machine, and obviously I'd like to spread the work over those cores to really make the harness fly.

Does anyone have any ideas about how we might architect the application (using either an event-loop approach or a CSP-style approach) to increase this output?

Also, most of the time in the prototype is spent writing to UDP. Since that is I/O, I shouldn't be surprised, but am I missing anything?

Any ideas welcome.

+1  A: 

Use multiple NICs; the hardware or the kernel interface is the limit. I can only reach 69,000 packets per second with a Broadcom Corporation NetXtreme BCM5704S Gigabit Ethernet adapter. Try a quad Intel Gigabit Server Adapter with all four NICs on the same subnet.
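
As a rough illustration of spreading traffic over several NICs, the sketch below opens one UDP socket per interface and pins its multicast output to that interface with the standard `IP_MULTICAST_IF` socket option, then rotates messages across the sockets. The function names, the interface addresses, and the group/port you pass in are hypothetical; this is only one possible way to do it, not a measured or tuned implementation.

```python
import itertools
import socket
import struct


def make_multicast_sender(interface_ip, ttl=1):
    """Create a UDP socket whose multicast output leaves via interface_ip.

    interface_ip is the local IPv4 address of the NIC to send from
    (an assumption: each NIC has its own address on the subnet).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # TTL of 1 keeps the datagrams on the local network segment.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                    struct.pack("b", ttl))
    # Select the outgoing interface for multicast datagrams.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                    socket.inet_aton(interface_ip))
    return sock


def send_round_robin(senders, group, port, messages):
    """Send each message on the next socket in turn, cycling over the NICs."""
    for sock, msg in zip(itertools.cycle(senders), messages):
        sock.sendto(msg, (group, port))
```

With four NICs you would build four senders, one per local address, and hand the list to `send_round_robin`. Whether this actually raises the packet rate depends on where the bottleneck is, so measure before and after.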

Steve-o
+1  A: 

The obvious answer when the question of exploiting multiple cores in a Python application comes up is to use multiple processes. With Twisted, you can use reactor.spawnProcess to launch a child process. You could also just start 16 instances of your application some other way (like a shell script). This requires that your application can operate sensibly with multiple instances running at once, of course. Exactly how you might divide the work so that each process can take on some of it depends on the nature of the work.
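
A minimal sketch of the multi-process idea follows. It uses the standard library's `multiprocessing` module rather than `reactor.spawnProcess`, purely to keep the example short, and it assumes the work can be divided by giving each process every Nth message. The group address, port, and function names are hypothetical placeholders, and the real division of work depends on what the harness has to send.

```python
import multiprocessing
import socket

GROUP = "239.1.1.1"  # hypothetical multicast group
PORT = 5007          # hypothetical port


def shard(messages, worker_index, worker_count):
    """Return the messages assigned to one worker (every Nth message)."""
    return messages[worker_index::worker_count]


def send_shard(messages, group=GROUP, port=PORT):
    """Run in a child process: send this worker's messages over UDP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        for msg in messages:
            sock.sendto(msg, (group, port))
    finally:
        sock.close()


def run_workers(messages, worker_count):
    """Start one sender process per shard and wait for all of them."""
    procs = [
        multiprocessing.Process(
            target=send_shard,
            args=(shard(messages, i, worker_count),),
        )
        for i in range(worker_count)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
```

Calling `run_workers(messages, 16)` would start one process per core on the machine described in the question. On platforms that start child processes by spawning a fresh interpreter, that call needs to sit under an `if __name__ == "__main__":` guard. Splitting the stream this way does not preserve ordering across processes, which may or may not matter for a test harness.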

I would expect a single GigE link to be saturated long before you have all 16 cores running full tilt though. Make sure you're focusing on the bottleneck in the system. As Steve-o said, you may want multiple NICs in the machine as well.

Jean-Paul Calderone