views:

422

answers:

4

Sorry for this dumb question on Asynchronous operations. This is how I understand it.

IIS has a limited set of worker threads waiting for requests. If one request is a long running operation, it will block that thread. This leads to fewer threads to serve requests.

Way to fix this - use asynchronous pages. When a request comes in, the main worker thread is freed and this other thread is created in some other place. The main thread is thus able to serve other requests. When the request completes on this other thread, another thread is picked from the main thread pool and the response is sent back to the client.

1) Where are these other threads located? Is there another thread pool?

2) IF ASP.NET likes creating new threads in this other thread pool(?), why not increase the number of threads in the main worker pool - they are all running on the same machine anyway? I don't see the advantage of moving that request to this other thread pool. Memory/CPU should be the same right?

3) If the main thread hands off a request to this other thread, why does the request not get disconnected? It magically hands off the request to another worker thread somewhere else and when the long running process completes, it picks a thread from the main worker pool and sends response to the client. I am amazed...but how does that work?

+1  A: 

1) The threads are in w3svc or whatever process is running the ASP.NET engine in your particular version of IIS.

2) Not sure what you mean here. You actually have control over how many threads are in the worker thread pool. This article is pretty good: http://msdn.microsoft.com/en-us/library/ms998549.aspx

3) I think you are confusing Requests and connections... To be honest, I haven't a clue how the internals of IIS works, but generally in applications that handle multiple requests simultaneously there is ONE master listening thread that will then hand off the actual work to a child thread (and do nothing else). The original request is not "disconnected" because these things are happening at completely different levels of the network protocol stack. Windows Server has no problem accepting multiple connections on TCP port 80. Think about how TCP/IP works and the fact that it is sending multiple discrete packets of information. You are thinking of "connection" like a single hose going from spigot A to spigot B, but of course that's not how it really works. It is more akin to a bucket that is just collecting whatever gets spilled into it.

Hope this helps.

Bryan
A: 

The answer also depends on which version of IIS you're talking about. In earlier versions, ASP.NET did not use "IIS threads". They were .NET ThreadPool threads. In IIS 7, the IIS and ASP.NET pipelines have been merged. I don't know which threads ASP.NET uses now.

The bottom line is, don't spawn your own threads.

John Saunders
+4  A: 

Async pages in ASP.NET use asynchronous callbacks, and asynchronous callbacks use the Thread Pool, and it is the same thread pool used to serve ASP.NET requests.

However, it's not quite that simple. The .NET ThreadPool has two types of threads - worker threads and I/O threads. I/O threads use what's called an I/O Completion Port, which is (greatly oversimplifying here) a thread-free or thread-agnostic means of waiting for a read/write operation on a file handle to complete, subsequently running a callback method.

(Note that a file handle does not necessarily refer to a file on disk; as far as Windows is concerned, it could just as well be a socket, pipe, etc.)

A typical .NET web developer doesn't really need to know about any of this. Of course, if you were writing an actual web server, or any kind of network server, then you would definitely need to learn about these, because they are the only way to handle hundreds of incoming connections without actually spawning hundreds of threads to serve them. There's a Managed I/O Completion Port tutorial (CodeProject) if you're interested.

Anyway, getting back on topic; when you interact with the thread pool at a high level, i.e. by writing:

ThreadPool.QueueUserWorkItem(s => DoSomeWork(s));

This does not use an I/O completion port. Ever. It posts the work to one of the normal worker threads managed by thread pool. It's the same if you use async callbacks:

Func<int> asyncFunc;

IAsyncResult BeginOperation(object sender, EventArgs e, AsyncCallback cb,
    object state)
{
    asyncFunc = () => { Thread.Sleep(500); return 42; };
    return asyncFunc.BeginInvoke(cb, state);
}

void EndOperation(IAsyncResult ar)
{
    int result = asyncFunc.EndInvoke(ar);
    Console.WriteLine(result);
}

Again - same deal. Inside the EndOperation you're running on a ThreadPool worker thread. You can verify this by inserting the following debugging code:

void EndSimpleWait(IAsyncResult ar)
{
    int maxWorkers, maxIO, availableWorkers, availableIO;
    ThreadPool.GetMaxThreads(out maxWorkers, out maxIO);
    ThreadPool.GetAvailableThreads(out availableWorkers, out availableIO);
    int result = asyncFunc.EndInvoke(ar);
}

Slap a breakpoint in there and you'll see that availableWorkers is one less than maxWorkers, while maxIO and availableIO are the same.

But some async operations are "special" in .NET. This actually has nothing to do with ASP.NET directly - they'll use I/O completion ports in a Winforms or WPF app too. Examples are:

And so on, this is nowhere near a full list. Basically almost every class in the .NET Framework that exposes its own BeginXYZ and EndXYZ methods and could conceivably perform any I/O, probably uses I/O completion ports. That's to make it easier for you, the application developer, because I/O threads are kind of hard to implement yourself in .NET.

My guess is that the .NET Framework designers deliberately chose to make it difficult to post I/O operations (compared to worker threads, where you can just write ThreadPool.QueueUserWorkItem) because it's comparatively "dangerous" if you don't know how to use them properly; by contrast, it's actually pretty straightforward to spawn these in the Windows API.

As before, you can verify what's happening with some debugging code:

WebRequest request;

IAsyncResult BeginDownload(object sender, EventArgs e,
    AsyncCallback cb, object state)
{
    request = WebRequest.Create("http://www.example.com");
    return request.BeginGetResponse(cb, state);
}

void EndDownload(IAsyncResult ar)
{
    int maxWorkers, maxIO, availableWorkers, availableIO;
    ThreadPool.GetMaxThreads(out maxWorkers, out maxIO);
    ThreadPool.GetAvailableThreads(out availableWorkers, out availableIO);
    string html;
    using (WebResponse response = request.EndGetResponse(ar))
    {
        using (StreamReader reader = new
            StreamReader(response.GetResponseStream()))
        {
            html = reader.ReadToEnd();
        }
    }
}

If you step through this one, you'll see that the thread stats are different. The availableWorkers will match maxWorkers, but availableIO is one less than maxIO. That's because you're running on an I/O thread. That's also why you're not supposed to do any expensive computations in async callbacks - posting CPU-intensive work on an I/O completion port is inefficient and, well, bad.

All of this explains why it's strongly recommended that you use Async pages in ASP.NET when you need to perform any I/O operations. The pattern is only useful for I/O operations; non-I/O async operations will end up being posted to worker threads in the ThreadPool and you'll still end up blocking subsequent ASP.NET requests. But you can spawn a virtually unlimited number of async I/O operations and not give it a second thought; these won't use any threads at all until the I/O is complete and the callback is ready to begin.

So, to summarize - there is only one ThreadPool, but there are different kinds of threads in it, and if you're performing slow I/O operations then it's much more efficient to use the I/O threads. It's got nothing to do with CPU or memory, it's all about I/O and file handles.


As for #3, it's not really a question of "why doesn't the request get disconnected", more like a question of "why would it?" A socket doesn't get closed simply because there's no thread currently sending to or receiving data from it, same way your front door doesn't automatically close if there's nobody there to greet guests. Client operations may time out if the server doesn't answer them, and may subsequently choose to disconnect from their end, but that's another issue altogether.

Aaronaught
@Aaron: my understanding has always been the opposite. That they _were_ thread-pool threads, but that this was not meant for background tasks that were CPU bound, but for tasks that had to wait a while for, say, web service responses or database queries to complete.
John Saunders
@John: We definitely seem to agree on the second part. I *could* be wrong about the first part (I admitted as much), but I can't find any documentation that explicitly states one or the other, and I don't see how it's possible to avoid starving the thread pool any other way. I'd love to see a definitive explanation; as of now, this is my best understanding of the mechanism. I *do* know that it's possible to starve the ASP.NET thread pool by manually queuing items, so if async methods *do* use pool threads, ASP.NET must perform some strange black magic to prevent request starvation.
Aaronaught
@Aaron: it would depend on what you queue. If you queue something that starts a query then returns, it wouldn't be a big deal. If you queue something that computes for 1/2 second, that would be a problem.
John Saunders
@John: I am aware of the I/O vs. CPU issue, but your comments reminded me of the whole mostly-undocumented I/O completion port usage in .NET, so I decided to run a few tests to verify my suspicions. I've updated the answer, and you're fundamentally correct, they are `ThreadPool` threads, but aren't shared with worker threads, they are threads specifically reserved for IOCP callbacks.
Aaronaught
+1  A: 

You didn't say which version of IIS or ASP.NET you're using. A lot of folks talk about IIS and ASP.NET as if they are one and the same, but they really are two components working together. Note that IIS 6 and 7 listen to an I/O completion port where they pick up completions from HTTP.sys. The IIS thread pool is used for this, and it has a maximum thread count of 256. This thread pool is designed in such a way that it does not handle long running tasks well. The recommendation from the IIS team is to switch to another thread if you're going to do substantial work, such as done by the ASP.NET ISAPI and/or ASP.NET "integrated mode" handler on IIS 7. Otherwise you will tie up IIS threads and prevent IIS from picking up completions from HTTP.sys Chances are you don't care about any of this, because you're not writing native code, that is, you're not writing an ISAPI or native handler for the IIS 7 pipeline. You're probably just using ASP.NET, in which case you're more interested in its thread pool and how it works.

There is a blog post at http://blogs.msdn.com/tmarq/archive/2007/07/21/asp-net-thread-usage-on-iis-7-0-and-6-0.aspx that explains how ASP.NET uses threads. Note that for ASP.NET v2.0 and v3.5 on IIS 7 you should increase MaxConcurrentRequestsPerCPU to 5000--it is a bug that it was set to 12 by default on those platforms. The new default for MaxConcurrentRequestsPerCPU in ASP.NET v4.0 on IIS 7 is 5000.

To answer your three questions:

1) First, a little primer. Only 1 thread per CPU can execute at a time. When you have more than this, you pay a penalty--a context switch is necessary every time the CPU switches to another thread, and these are expensive. However, if a thread is blocked waiting on work...then it makes sense to switch to another thread, one that can execute now.

So if I have a thread that is doing a lot of computational work and using the CPU heavily, and this takes a long time, should I switch to another thread? No! The current thread is efficiently using the CPU, so switching will only incur the cost of a context switch.

So if I have a thread that makes an HTTP or SOAP request to another server and takes a long time, should I switch threads? Yes! You can perform the HTTP or SOAP request asynchronously, so that once the "send" has occurred, you can unwind the current thread and not use any threads until there is an I/O completion for the "receive". Between the "send" and the "receive", the remote server is busy, so locally you don't need to be blocking on a thread, but instead make use of the async APIs provided in .NET Framework so that you can unwind and be notified upon completion.

Ok, so you're #1 questions was "Where are these other threads located? Is there another thread pool?" This depends. Most code that runs in .NET Framework uses the CLR ThreadPool, which consists of two types of threads, worker threads and i/o completion threads. What about code that doesn't use CLR ThreadPool? Well, it can create its own threads, use its own thread pool, or whatever it wants because it has access to the Win32 APIs provided by the operating system. Based on what we discussed a bit ago, it really doesn't matter where the thread comes from, and a thread is a thread as far as the operating system and hardware is concerned.

2) In your second question, you state, "I don't see the advantage of moving that request to this other thread pool." You're correct in thinking that there is NO advantage to switching unless you're going to make up for that costly context switch you just performed in order to switch. That's why I gave an example of a slow HTTP or SOAP request to a remote server as an example of a good reason to switch. And by the way, ASP.NET does not create any threads. It uses the CLR ThreadPool, and the threads in that pool are entirely managed by the CLR. They do a pretty good job of determining when you need more threads. For example, that's why ASP.NET can easily scale from executing 1 request concurrently to executing 300 requests concurrently, without doing anything. The incoming requests are posted to the CLR ThreadPool via a call to QueueUserWorkItem, and the CLR decides when to call the WaitCallback (see MSDN).

3) The third question is, "If the main thread hands off a request to this other thread, why does the request not get disconnected?" Well, IIS picks up the I/O completion from HTTP.sys when the request initially arrives at the server. IIS then invokes ASP.NET's handler (or ISAPI). ASP.NET immediately queues the request to the CLR Threadpool, and returns a pending status to IIS. This pending status tells IIS that we're not done yet, but as soon as we are done we'll let you know. Now ASP.NET manages the life of that request. When a CLR ThreadPool thread invokes the ASP.NET WaitCallback (see MSDN), it can execute the entire request on that thread, which is the normal case. Or it can switch to one or more other threads if the request is what we call asynchronous--i.e. it has an asynchronous module or handler. Either way, there are well defined ways in which the request completes, and when it finally does, ASP.NET will tell IIS we're done, and IIS will send the final bytes to the client and close the connection if Keep-Alive is not being used.

Regards, Thomas

Thomas