Does anyone have any good resources that show creating an unlimited number of threads in C# WITHOUT using a ThreadPool?

I realize one might question the architecture of a system with hundreds or thousands of threads, so let me explain the task in case the CPU/OS will make this effort moot.

I have about 2500 URLs that I need to test. Some of them are very slow: in excess of 10 seconds to respond. In any event, network latency makes up about 99.99% of each operation.

I'd like to test all 2500 URLs as fast as possible.

I wired up a test that checks each of them in its own thread.

The problem is I'm using the ThreadPool and I think the default limit is 25, so that's no good. I need to manage them manually. Am I way out to lunch here?

I realize the CPU/OS will probably also limit the number of concurrent threads per CPU, but I trust this limit is WAY higher than 25.

Regarding architecture, I realize I may be locking up an entire box if I were to wire up 2 thousand HTTP threads, but this is an admin task that runs in isolation and can take as many resources as are available.

Thanks for your insights.

+9  A: 

You cannot create an unlimited number of threads. You will run into many problems if you try.

However, you can increase the default number of threads in the ThreadPool in C#. Just use ThreadPool.SetMaxThreads to give the thread pool more threads with which to work. It will likely do a better job than any manual threading attempts (without putting a LOT of effort into the manual process).
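As a rough sketch of the SetMaxThreads suggestion: the helper below reads the current limits with ThreadPool.GetMaxThreads and only ever raises the worker limit, never lowers it. The names ThreadPoolLimits and EnsureMaxWorkerThreads are made up for illustration; the two ThreadPool calls are the real API.

```csharp
using System;
using System.Threading;

public static class ThreadPoolLimits
{
    // Raises the worker-thread ceiling to at least desiredWorkers and
    // leaves the I/O completion-port ceiling unchanged.
    // Returns false if the runtime rejects the requested limit.
    public static bool EnsureMaxWorkerThreads(int desiredWorkers)
    {
        int workers, ioThreads;
        ThreadPool.GetMaxThreads(out workers, out ioThreads);

        if (workers >= desiredWorkers)
            return true; // already high enough; never lower the limit

        return ThreadPool.SetMaxThreads(desiredWorkers, ioThreads);
    }
}
```

Call it once at startup, e.g. ThreadPoolLimits.EnsureMaxWorkerThreads(100), before queuing the URL checks. SetMaxThreads reports a rejected value (for instance one below the number of processors) by returning false rather than throwing, so the return value is worth checking.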

Reed Copsey
Right, so 256 threads per CPU is the maximum. I'm assuming that the best-practice recommendation is to not exceed this, as the switching overhead becomes counterproductive. The main reason I'm inquiring, though, is that the network latency per thread is SO long that I'm willing to incur a large degree of threading-related overhead so as to not wait in line on the network.
Scott
Just try using your current approach, but set the max threads on the thread pool much higher. That will likely be your best option. Not all of your test cases are going to take a long time, I'm sure, so those will pump through fast and you'll end up just waiting on the bad ones. 2500 threads for 2500 URLs would probably take longer due to switching and thread creation overhead, even if you could do it.
Reed Copsey
Your effective thread limit may only be around 2000 threads, depending upon the default stack size assigned by the linker to each thread. See http://blogs.msdn.com/oldnewthing/archive/2005/07/29/444912.aspx.
ebpower
+3  A: 

Well, the maximum number of threads in ThreadPool is 256, so if you need more, you'll have to do it manually. (Edit: Whoops -- that's compact framework only)

Starting a new thread manually is as easy as:

Thread newThread = new Thread(new ThreadStart(myWorkerMethod));
newThread.Start();

That said, you should probably reconsider your approach. If you need that many threads, odds are you're doing it wrong.

Randolpho
But doesn't this just use a thread from the thread pool, which, as stated in other answers, cannot exceed 256 per CPU?
David McEwing
No. This uses a thread separate from the thread pool.
Reed Copsey
+1  A: 

Although the need to test 2500 URLs suggests that you need 2500 threads, it is highly unlikely you will need all 2500. The ThreadPool will rapidly recycle those threads that reach a web address that responds quickly.

So you may see peaks of a few dozen threads. Beyond that, I doubt that more threads will substantially increase performance. You will reach a point of diminishing returns due to the thread overhead.

Robert Harvey
+4  A: 

You should also be aware that Windows XP (and possibly Vista/Win7) has a limit on the number of half-open TCP connections you can have (10). If you are waiting for sites to respond that don't exist, adding more threads won't get around this problem.

Jon Tackabury
That limit has been removed for Vista SP2 and Windows 7
ebpower
Thanks for the heads-up. :)
Jon Tackabury
A: 

You can run multiple instances of the application and give them a higher priority in Task Manager.

bashmohandes
+1  A: 

I had a similar challenge and instead of using the thread pool, I created a thread for each URL I wanted to hit, added it to a Queue, and then popped each thread off of the queue and started it. I kept track of the number of running threads, and as each one completed then I grabbed the next waiting thread. In my user interface I could tweak the maximum running connections and also watch the number of queued connections.

What you're up against is the number of active connections more than the number of active threads, as these threads are blocked while they're waiting for a response. The thread pool reuses threads and saves you the overhead of creating and starting threads, which helps when your processing is CPU-intensive; 25 is a reasonable limit unless you have a lot of cores. But when you're waiting on a network connection, the thread overhead is insignificant.

You can raise the maximum number of concurrent connections (it defaults to two) in your app.config by setting the maxconnection value: http://msdn.microsoft.com/en-us/library/aa903351%28VS.71%29.aspx.

You can create a large number of threads, but you only get "charged" for threads you start. You're limited by system resources, but I've been able to get away with hundreds without a serious performance hit.
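A minimal sketch of the "cap the number of running threads" idea, not the exact code described above: it creates each thread only when a slot frees up instead of pre-building a queue of Thread objects, and uses a SemaphoreSlim as the running-thread counter. ThrottledRunner, Run, and the checkUrl callback are hypothetical names; the callback is assumed to catch its own exceptions, since an unhandled exception on a manual thread brings the process down.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ThrottledRunner
{
    // Runs checkUrl once per URL on dedicated (non-pool) threads, with at
    // most maxRunning threads started and unfinished at any moment.
    public static void Run(IEnumerable<string> urls, Action<string> checkUrl, int maxRunning)
    {
        using (var slots = new SemaphoreSlim(maxRunning, maxRunning))
        {
            var started = new List<Thread>();
            foreach (string url in urls)
            {
                slots.Wait();          // block until a running thread finishes
                string current = url;  // per-iteration copy for the closure
                var worker = new Thread(() =>
                {
                    try { checkUrl(current); }
                    finally { slots.Release(); } // free the slot even on failure
                });
                worker.IsBackground = true;
                started.Add(worker);
                worker.Start();
            }

            // Wait for the stragglers before disposing the semaphore.
            foreach (Thread t in started)
                t.Join();
        }
    }
}
```

With something like ThrottledRunner.Run(urls, CheckOne, 100), where CheckOne is whatever method tests a single URL, the maxRunning argument plays the role of the "maximum running connections" knob mentioned above.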

ebpower
+1  A: 

You might want to read The C10K problem which is about building software to handle more than ten thousand connections; most of the approaches it lists have analogues in Windows. There's an introduction to asynchronous sockets in C# on codeguru. Basically in asynchronous IO, rather than switching thread contexts and each thread testing one socket, an event driven approach is used which hooks into the OS socket implementation to report the sockets which are available. You also might want to tweak some of the Windows TCP settings in the registry, such as the maximum number of connections.

Pete Kirkham
+2  A: 

You may also want to look into a different ThreadPool provider; MiscUtil.Threading.CustomThreadPool from MiscUtil (authored by Jon Skeet) provides a Custom Thread Pool implementation that allows you to specify a maximum number of threads, while also ensuring that only one set of tasks/applications is using the Thread Pool.

More precisely, while you may want to spin up 50-1000 threads, you probably shouldn't reinvent the wheel as far as distributing work to threads goes.

Also, if you are using HttpWebRequest and HttpWebResponse for URL checking, you'll probably also need to modify ServicePointManager.DefaultConnectionLimit. By default, there's a limit on the number of concurrent WebRequests that can be outstanding (either 2 or 10), which rather hampers any possible benefit from having a machine capable of running several hundred threads.
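For illustration, raising that limit in code could look like the sketch below. ConnectionLimits and Raise are hypothetical names; ServicePointManager.DefaultConnectionLimit is the real property. This assumes the classic HttpWebRequest stack on .NET Framework; newer runtimes mark ServicePointManager as obsolete, and it may not govern other HTTP clients.

```csharp
using System.Net;

public static class ConnectionLimits
{
    // Call once at startup, before the first HttpWebRequest is created:
    // ServicePoint objects that already exist keep their old limit.
    public static void Raise(int perHostLimit)
    {
        ServicePointManager.DefaultConnectionLimit = perHostLimit;
    }
}
```

For example, ConnectionLimits.Raise(100) allows up to 100 concurrent connections per host for requests created afterwards.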

CoderTao
Thanks CoderTao, the ServicePointManager limit also improved performance.
Scott
+2  A: 

I'm not sure if this answer has already been proposed, but I don't see the need for more than 2 or 3 threads.

One thread just does all the request and finishes. The second thread waits for the replies, once a reply arrives, it enqueues the replies in a reply queue. The third thread dequeues and processes the replies.

Just plain and simple.

There is one BUT: I'm not sure if you can receive HTTP replies asynchronously in .NET. I assume you can, but I'm not sure.

Either I'm missing the point completely or you guys are overcomplicating this.

Henri
Yes, you can do asynchronous HTTP operations in .Net
Frank Schwieterman
That's pretty good, Henri... the only concern I have is: what limitations does WebClient impose on asynchronous downloads?
Scott
+2  A: 

Thanks a lot guys...I wish I could accept multiple answers.

What I did was:

a.) Set the MaxThreads property to 100.

b.) Place the following inside the config file:

<system.net>
  <connectionManagement>
    <add address="*" maxconnection="100" />
  </connectionManagement>
</system.net>

c.) Up the TCP/IP limit of XP to 100 from 10.

d.) Modify the ServicePointManager.DefaultConnectionLimit to 100.

These solutions combined greatly increased the performance.

Now I see though that Henri's comment makes a good deal of sense.

I may not even need threads...I could have the same thread fire a call to WebClient.DownloadStringAsync which would simulate the threads I have but be much simpler.

The problem is again, I may run into internal WebClient/.NET limitations that I would then need to work around...
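A hedged sketch of the no-extra-threads idea, swapping in the newer Task-based style (async/await, HttpClient) for WebClient.DownloadStringAsync; those features postdate this thread and are an assumption here. AsyncUrlChecker, CheckAllAsync, HttpProbe, and the probe delegate are hypothetical names. A SemaphoreSlim caps the number of requests in flight, and a failed probe is simply recorded as false.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncUrlChecker
{
    // Probes every distinct URL with at most maxConcurrent requests in
    // flight. Returns a map of URL -> whether the probe reported success.
    public static async Task<Dictionary<string, bool>> CheckAllAsync(
        IEnumerable<string> urls, Func<string, Task<bool>> probe, int maxConcurrent)
    {
        using (var slots = new SemaphoreSlim(maxConcurrent))
        {
            var tasks = urls.Distinct().Select(async url =>
            {
                await slots.WaitAsync();
                try { return new KeyValuePair<string, bool>(url, await probe(url)); }
                catch (Exception) { return new KeyValuePair<string, bool>(url, false); }
                finally { slots.Release(); }
            }).ToList();

            KeyValuePair<string, bool>[] results = await Task.WhenAll(tasks);
            return results.ToDictionary(r => r.Key, r => r.Value);
        }
    }

    // One possible probe: a GET through a shared HttpClient.
    public static Func<string, Task<bool>> HttpProbe(HttpClient client)
    {
        return async url =>
        {
            using (HttpResponseMessage response = await client.GetAsync(url))
                return response.IsSuccessStatusCode;
        };
    }
}
```

Usage would be along the lines of await AsyncUrlChecker.CheckAllAsync(urls, AsyncUrlChecker.HttpProbe(client), 100). No thread sits blocked while a slow site responds, so the practical limits become the connection settings discussed above rather than the thread count.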

Scott
A: 

What you want is an asynchronous web request in C#. You don't have to worry about threads, thread pools, or slow web sites. Have a look at http://stackoverflow.com/questions/202481/how-to-use-httpwebrequest-net-asynchronously and let the threads rest in peace.