Let's say I wanted to scrape a webpage and extract some data. I'd most likely write something like this:

open System.Net

let getAllHyperlinks (url: string) =
    async {
        let req = WebRequest.Create(url)
        let! rsp = req.AsyncGetResponse()                        // F# async extension on WebRequest
        use stream = rsp.GetResponseStream()                     // depends on rsp
        use reader = new System.IO.StreamReader(stream)          // depends on stream
        let! data = reader.ReadToEndAsync() |> Async.AwaitTask   // depends on reader
        return extractAllUrls data                               // depends on data
    }

The let! tells F# to execute the code in another thread, then bind the result to a variable, and continue processing. The sample above uses two let! statements: one to get the response and one to read all the data, so it spawns at least two threads (please correct me if I'm wrong).

Although the workflow above spawns several threads, the order of execution is serial because each item in the workflow depends on the previous item. It's not really possible to evaluate any items further down the workflow until the other threads return.

Is there any benefit to having more than one let! in the code above?

If not, how would this code need to change to take advantage of multiple let! statements?

+6  A: 

The key is that we are not spawning any new threads. During the whole course of the workflow, either 1 or 0 threads from the ThreadPool are in use. (One exception: up until the first '!', the code runs on the user thread that called Async.Run.) "let!" lets go of a thread while the async operation is at sea, and then picks up a thread from the ThreadPool when the operation returns. The (performance) advantage is less pressure on the ThreadPool (and of course the major user advantage is the simple programming model - a million times better than all that BeginFoo/EndFoo/callback stuff you would otherwise write).

See also http://cs.hubfs.net/forums/thread/8262.aspx
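A minimal sketch of that behaviour, assuming a current FSharp.Core (where Async.Run is spelled Async.RunSynchronously). The simulatedIo helper is hypothetical and only stands in for a slow network or file read; Async.Sleep waits without holding a thread, so the workflow may resume on a different ThreadPool thread after each let!.

```fsharp
open System.Threading

// Hypothetical stand-in for a slow I/O call; Async.Sleep holds no thread while waiting.
let simulatedIo (ms: int) = async {
    do! Async.Sleep ms
    return ms }

let workflow = async {
    let before = Thread.CurrentThread.ManagedThreadId
    let! first = simulatedIo 100     // thread is released while the "I/O" is pending
    let middle = Thread.CurrentThread.ManagedThreadId
    let! second = simulatedIo 100    // released again; resumed on some ThreadPool thread
    let after = Thread.CurrentThread.ManagedThreadId
    printfn "thread ids: %d -> %d -> %d" before middle after
    return first + second }

let total = Async.RunSynchronously workflow
printfn "total = %d" total
```

The printed thread ids vary from run to run, and may or may not differ from one another; the point is only that no thread sits blocked while each operation is outstanding.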

Brian
Ok, so let! doesn't spawn multiple threads, it just releases the thread handle back to the threadpool :) I imagine this comes with a small amount of overhead, so I probably wouldn't "let!" each and every line. Are there any rules for placing "let!" in the most optimal locations?
Juliet
Place let! on every line where you are going to do an async call that will take a while and doesn't need a thread while it's gone (reading from network or file stream). So both "let!"s in your example are "good".
Brian
If you are going to run many copies of the workflow, any 'overhead' of the "let!" will be dwarfed by the return you get from keeping the CPU active without having to spawn extra threads.
Brian
Thanks, Brian, your replies are helpful and I appreciate it :)
Juliet
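To illustrate the point about running many copies of the workflow, here is a rough sketch using Async.Parallel. The fakeDownload function is hypothetical: it replaces the real request with a 200 ms Async.Sleep so the example is self-contained. With a real workflow such as getAllHyperlinks, the list would instead be built from URLs.

```fsharp
open System.Diagnostics

// Hypothetical stand-in for one page download: waits 200 ms without holding a thread.
let fakeDownload (id: int) = async {
    do! Async.Sleep 200
    return id }

let stopwatch = Stopwatch.StartNew()

// Start 100 copies of the workflow and wait for all of them.
let results =
    [ for id in 1 .. 100 -> fakeDownload id ]
    |> Async.Parallel
    |> Async.RunSynchronously

stopwatch.Stop()
printfn "completed %d workflows in %d ms" results.Length stopwatch.ElapsedMilliseconds
```

Run one after another, the 100 waits would add up to about 20 seconds. Because each let!/do! gives its thread back while waiting, the whole batch should finish in a small fraction of that without needing 100 threads.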
+1  A: 

I was writing an answer but Brian beat me to it. I fully agree with him.

I'd like to add that if you want to parallelize synchronous code, the right tool is PLINQ, not async workflows, as Don Syme explains.
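As a rough sketch of what the PLINQ route can look like from F#, assuming the work is CPU-bound and synchronous. The isPrime function is a hypothetical placeholder for whatever computation needs parallelizing.

```fsharp
open System.Linq

// Hypothetical CPU-bound function standing in for synchronous work.
let isPrime (n: int) =
    n > 1 && Seq.forall (fun d -> n % d <> 0) (seq { 2 .. int (sqrt (float n)) })

// AsParallel spreads the filter across cores; AsOrdered keeps the input order.
let primes =
    Enumerable.Range(1, 100000)
        .AsParallel()
        .AsOrdered()
        .Where(fun n -> isPrime n)
        .ToArray()

printfn "found %d primes" primes.Length
```

Here the threads are kept busy doing actual computation, which is the opposite situation from the web request in the question, where the thread would otherwise just wait.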

Mauricio Scheffer