I don't see the difference between C#'s (and VB's) new async features and .NET 4.0's Task Parallel Library. Take, for example, Eric Lippert's code from here:

async void ArchiveDocuments(List<Url> urls) {
    Task archive = null;
    for(int i = 0; i < urls.Count; ++i) {
        var document = await FetchAsync(urls[i]);
        if (archive != null)
            await archive;
        archive = ArchiveAsync(document);
    }
}

It seems that the await keyword is serving two different purposes. The first occurrence (FetchAsync) seems to mean, "If this value is used later in the method and its task isn't finished, wait until it completes before continuing." The second instance (archive) seems to mean, "If this task is not yet finished, wait right now until it completes." If I'm wrong, please correct me.

Couldn't it just as easily be written like this?

void ArchiveDocuments(List<Url> urls) {
    Task archive = null;                          // still needed: tracks the previous archive task
    for(int i = 0; i < urls.Count; ++i) {
        var document = FetchAsync(urls[i]);       // removed await
        if (archive != null)
            archive.Wait();                       // changed to .Wait()
        archive = ArchiveAsync(document.Result);  // added .Result
    }
}

I've replaced the first await with a Task.Result where the value is actually needed, and the second await with Task.Wait(), where the wait is actually occurring. The functionality is (1) already implemented, and (2) much closer semantically to what is actually happening in the code.

I do realize that an async method is rewritten as a state machine, similar to iterators, but I also don't see what benefits that brings. Any code that requires another thread to operate (such as downloading) will still require another thread, and any code that doesn't (such as reading from a file) could still utilize the TPL to work with only a single thread.

I'm obviously missing something huge here; can anybody help me understand this a little better?

A: 

The call to FetchAsync() will still block until it completes (unless a statement within it calls await?). The key is that control is returned to the caller (because the ArchiveDocuments method itself is declared as async). So the caller can happily continue processing UI logic, respond to events, etc.

When FetchAsync() completes, it interrupts the caller to finish the loop. It hits ArchiveAsync() and blocks, but ArchiveAsync() probably just creates a new task, starts it, and returns the task. This allows the second loop to begin, while the task is processing.

The second loop hits FetchAsync() and blocks, returning control to the caller. When FetchAsync() completes, it again interrupts the caller to continue processing. It then hits await archive, which returns control to the caller until the Task created in loop 1 completes. Once that task is complete, the caller is again interrupted, and the second loop calls ArchiveAsync(), which gets a started task and begins loop 3, repeat ad nauseam.

The key is returning control to the caller while the heavy lifters are executing.

James B
Note that "the heavy lifters" might not be executing. They might not be in parallel at all. They might be units of work that are to be scheduled on this thread whenever this thread goes idle, for example. I'm seeing a *lot* of conflation of asynchrony with parallelism and I am eager to disabuse people of that notion; asynchrony is often about *not* producing more worker threads. The key is actually return of control to the caller *after* "the heavy lifters" have done *something* to ensure that their task completes at some time in the future.
Eric Lippert
+8  A: 

There is a huge difference:

Wait() blocks, await does not block. If you run the async version of ArchiveDocuments() on your GUI thread, the GUI will stay responsive while the fetching and archiving operations are running. If you use the TPL version with Wait(), your GUI will be blocked.

Note that async manages to do this without introducing any threads - at the point of the await, control is simply returned to the message loop. Once the task being waited for has completed, the remainder of the method (continuation) is enqueued on the message loop and the GUI thread will continue running ArchiveDocuments where it left off.
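The difference can be seen even without a GUI. The following is a minimal console sketch, not the original code: SlowOperationAsync is a hypothetical stand-in for something like FetchAsync, and Task.Delay simulates the slow work. It only shows that an async method hands control back to its caller at the await, whereas Wait() keeps the calling thread until the work is done.

```csharp
using System;
using System.Threading.Tasks;

static class WaitVersusAwait
{
    // Hypothetical stand-in for a slow operation such as FetchAsync.
    static Task SlowOperationAsync()
    {
        return Task.Delay(200);
    }

    // Returns to its caller at the first await whose task is not yet complete.
    public static async Task AwaitVersion()
    {
        await SlowOperationAsync();
    }

    // Does not return until the slow operation has finished.
    public static void WaitVersion()
    {
        SlowOperationAsync().Wait();
    }

    // True if AwaitVersion handed control back before its work was done.
    public static bool AwaitReturnsEarly()
    {
        Task pending = AwaitVersion();
        bool returnedEarly = !pending.IsCompleted;
        pending.Wait(); // block here only so the console demo can finish cleanly
        return returnedEarly;
    }

    static void Main()
    {
        Console.WriteLine("await returned early: " + AwaitReturnsEarly());
        WaitVersion(); // the calling thread is stuck here for the full delay
        Console.WriteLine("Wait() returned only after the work was done");
    }
}
```

On a GUI thread, the time between the early return and the completion of the task is exactly the time in which the message loop gets to run.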

Daniel
+1  A: 

The problem here is that the signature of ArchiveDocuments is misleading. It has an explicit return of void but really the return is Task. To me void implies synchronous as there is no way to "wait" for it to finish. Consider the alternate signature of the function.

async Task ArchiveDocuments(List<Url> urls) { 
  ...
}

To me when it's written this way the difference is much more obvious. The ArchiveDocuments function is not one that completes synchronously but will finish later.
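As a rough illustration of what the Task-returning signature gives the caller, here is a small sketch. The method body, the string URLs, and the archived list are assumptions made for the example (Task.Delay stands in for the real fetching and archiving); only the shape of the call matters.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class TaskReturningExample
{
    // Hypothetical stand-in for the question's method: same idea, but it
    // returns Task so that callers can observe when it finishes.
    public static async Task ArchiveDocuments(List<string> urls, List<string> archived)
    {
        foreach (string url in urls)
        {
            await Task.Delay(10); // simulated fetch and archive
            archived.Add(url);
        }
    }

    public static async Task<int> CallerAsync()
    {
        var archived = new List<string>();
        var urls = new List<string> { "a", "b", "c" };

        // The call returns a Task as soon as the method hits its first
        // incomplete await; the archiving itself finishes later.
        Task pending = ArchiveDocuments(urls, archived);

        // The caller is free to do other work here, then wait for the outcome.
        await pending;
        return archived.Count;
    }

    static void Main()
    {
        Console.WriteLine(CallerAsync().Result); // prints 3
    }
}
```

With a void return there is nothing to hold on to, so the caller has no equivalent of the `await pending` line.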

JaredPar
This code would make a lot more sense if "archive" were a member or static variable, and not defined within the method... That being said, void returning asynchronous functions are perfectly valid and acceptable, and have a "fire and forget" meaning, as per the spec documentation.
Reed Copsey
Are you saying that the call to `ArchiveDocuments()` would look like `Task task = ArchiveDocuments(List<Url> urls);`? That doesn't seem right... and if I understand correctly, the caller isn't really waiting at all. In fact, if it cares about the outcome of `ArchiveDocuments()`, the whole scenario falls apart and doesn't work.
James B
@Reed, I agree they are indeed valid and correct. I just find it very misleading and a bit presumptuous as maybe I don't want to forget ;)
JaredPar
@James, I'm saying that I think the difference makes a bit more sense if you consider that `Task` is also a valid return.
JaredPar
+19  A: 

I think the misunderstanding arises here:

It seems that the await keyword is serving two different purposes. The first occurrence (FetchAsync) seems to mean, "If this value is used later in the method and its task isn't finished, wait until it completes before continuing." The second instance (archive) seems to mean, "If this task is not yet finished, wait right now until it completes." If I'm wrong, please correct me.

This is actually completely incorrect. Both of these have the same meaning.

In your first case:

var document = await FetchAsync(urls[i]);

What happens here, is that the runtime says "Start calling FetchAsync, then return the current execution point to the thread calling this method." There is no "waiting" here - instead, execution returns to the calling synchronization context, and things keep churning. At some point in the future, FetchAsync's Task will complete, and at that point, this code will resume on the calling thread's synchronization context, and the next statement (assigning the document variable) will occur.

Execution will then continue until the second await - at which time, the same thing will happen - if the Task (archive) isn't complete, execution will be released to the calling context - otherwise, execution carries straight on and archive is assigned the next task.

In the second case, things are very different - here, you're explicitly blocking, which means that the calling synchronization context will never get a chance to execute any code until your entire method completes. Granted, there is still asynchrony, but the asynchrony is completely contained within this block of code - no code outside of this pasted code will happen on this thread until all of your code completes.
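To make "start the call, return, resume later" more concrete, here is a hand-written approximation of what a single await does. This is a simplified sketch, not the code the compiler actually generates: FetchAsync here is a hypothetical stand-in that returns a string, and the real transformation is a state machine that handles loops and more cases than this.

```csharp
using System;
using System.Threading.Tasks;

static class AwaitAsContinuation
{
    // Hypothetical stand-in for FetchAsync: completes later with a string.
    static async Task<string> FetchAsync(string url)
    {
        await Task.Delay(100);
        return "contents of " + url;
    }

    // The form used in the question.
    public static async Task<int> WithAwait(string url)
    {
        string document = await FetchAsync(url);
        return document.Length;
    }

    // Roughly the same thing by hand: register the rest of the method as a
    // continuation on the task's awaiter and return to the caller at once.
    public static Task<int> WithContinuation(string url)
    {
        var result = new TaskCompletionSource<int>();
        var awaiter = FetchAsync(url).GetAwaiter();
        awaiter.OnCompleted(() =>
        {
            try
            {
                // GetResult does not block here: the task has already completed.
                result.SetResult(awaiter.GetResult().Length);
            }
            catch (Exception e)
            {
                result.SetException(e);
            }
        });
        return result.Task; // nothing has waited; the caller just gets a Task
    }

    static void Main()
    {
        const string url = "http://example.com";
        Console.WriteLine(WithAwait(url).Result == WithContinuation(url).Result);
    }
}
```

Neither version blocks the calling thread while FetchAsync is outstanding. The .Result calls in Main block only because a console program has nothing else to do.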

Reed Copsey
+1 Great explanation! Short and to the point.
BFree
That makes a lot more sense, thanks. I think `yield` would have been a better keyword choice, but what's done is done, I guess.
zildjohn01
@zildjohn01: It's not "done" yet. An older preview used `yield`, but the team decided against it. They're actively taking feedback and suggestions at: http://social.msdn.microsoft.com/Forums/en-US/async/threads
Reed Copsey
+2  A: 

The ability to turn the program flow of control into a state machine is what makes these new keywords interesting. Think of it as yielding control, rather than values.

Check out this Channel 9 video of Anders talking about the new feature.

John Leidegren
+1, I quoted you in a comment on Eric Lippert's blog, hope that's okay.
zildjohn01
+2  A: 

Anders boiled it down to a very succinct answer in the Channel 9 Live interview he did. I highly recommend it.

The new async and await keywords allow you to orchestrate concurrency in your applications. They don't actually introduce any concurrency into your application.

TPL, and more specifically Task, is one way to actually perform operations concurrently. The new async and await keywords allow you to compose these concurrent operations in a "synchronous" or "linear" fashion.

So you can still write a linear flow of control in your programs while the actual computing may or may not happen concurrently. When computation does happen concurrently, await and async allow you to compose these operations.
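A small sketch of that split, under assumptions of my own: ComputeAsync is a hypothetical operation, and Task.Run is used here as the part that actually introduces concurrency. The async method adds none of its own; it only states, in linear code, how the results are combined.

```csharp
using System;
using System.Threading.Tasks;

static class ComposeExample
{
    // Hypothetical concurrent operation: the TPL (Task.Run) does the actual
    // scheduling of the work onto the thread pool.
    static Task<int> ComputeAsync(int n)
    {
        return Task.Run(() => n * n);
    }

    // async/await introduce no concurrency here; they only describe the
    // order in which the two results are consumed.
    public static async Task<int> SumOfSquaresAsync(int a, int b)
    {
        Task<int> first = ComputeAsync(a);   // both operations are started...
        Task<int> second = ComputeAsync(b);
        return await first + await second;   // ...then composed in linear code
    }

    static void Main()
    {
        Console.WriteLine(SumOfSquaresAsync(3, 4).Result); // prints 25
    }
}
```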

Foovanadil
This doesn't actually *say* anything, does it? Let me rephrase: How does this answer the question posed?
Lasse V. Karlsen
The question is "How does C# 5.0's async-await feature differ from the TPL?" My answer is appropriate for that question IMO
Foovanadil