



I'm running a C# Console Application that is multi-threaded. The core process retrieves some data to work on, splits it up into a configurable number of smaller datasets, and then spawns the same number of threads to process each subset of data.

To process an individual record, a thread has to make a call to a web service using the WebRequest class and POST method. The query is sent with GetRequestStream(), and the response is retrieved with GetResponse().

In pseudo-code, the routine looks something like this:

prepare WebRequest data;
* get time (start-of-Processing);
Stream str = request.GetRequestStream();
Write data to stream;
WebResponse resp = request.GetResponse();
* get time (response-received);
process response;
finally close response stream;

Timing data suggests that when we split our data across more than 4 threads, our throughput for the process as a whole does not improve, and in some cases even drops. The web service's own timing data indicates that its performance remains constant.

  • At 4 threads, our apparent overhead to send the data and retrieve the response stream averages around a second.
  • When we run more than 4 threads, the average rises, with maximum values reaching tens of seconds!

Today I was able to run two separate processes, each running 4 threads (while ensuring that each thread was still working on unique data). This time, we nearly doubled our overall throughput, and each process had stable timing of about a second.

This leads me to believe we are hitting some kind of limitation on resources in relation to the WebRequest class; but it is a per-process limitation, not a machine limitation. I am aware that we could make our calls asynchronously with BeginGetRequestStream and BeginGetResponse, but I'm sceptical that it will have a positive impact if we are in fact hitting some kind of resource limit?!

What should I look at to enable us to raise the number of splits within the single process without the drop in performance?

You need to raise the number of simultaneous web requests you can make to a single host - otherwise your threads will basically be waiting for each other to finish, despite there being plenty of CPU available. The easiest way to do this is to use the <connectionManagement> element of app.config:

      <add address = "*" maxconnection = "100" />
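For context, the `<add>` line above is normally nested inside `<system.net>` in the application's configuration file. The following is a minimal sketch, assuming a standard .NET Framework app.config with no other settings; the value of 100 is simply the one used above, not a tuned recommendation:

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.net>
    <connectionManagement>
      <!-- "*" applies the limit to every host; 100 is the example value from above -->
      <add address="*" maxconnection="100" />
    </connectionManagement>
  </system.net>
</configuration>
```

The same limit can also be set in code through `ServicePointManager.DefaultConnectionLimit`, provided it is assigned before the first request to the host is created.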
Jon Skeet
Thanks Jon - this sounds hopeful... I'll give more feedback once I've had a chance to test this which will be tomorrow :)
Thank you, thank you, thank you Jon! Not only did this config change enable me to up the number of threads I was running, it also reduced that 'one second' overhead by quite a bit - so I must already have been getting quite a bit of contention.

How many processors/cores does the computer that you're running this on have?

When you schedule more threads than there are cores in your system, the scheduler has to time-slice the threads across the available cores. So, unless there is dead time in your process (for example, threads waiting on I/O), the performance won't increase and may actually drop - which is what you're describing.

Miky Dinescu
If the web requests are taking about a second each, that sounds like the application is very far from CPU bound - and the fact that it runs twice as fast when there are two processes confirms that.
Jon Skeet
I guess that makes sense. My reasoning was that since he said 4 threads work fine but anything more degrades performance, and since quad-cores are very popular, it seemed like a possible cause for the trouble. But the more I think about it, the less sense it makes.
Miky Dinescu
We're just about to move to a quad core, but it was on a dual core that we really found the 4-request 'limit'. CPU runs at about 1-2% on this process, as does Network (according to Task Manager), so neither of those appears to be the problem...