I've got some code that screen-scrapes a website (for illustrative purposes only!):
public System.Drawing.Image GetDilbert()
{
    var dilbertUrl = new Uri(@"http://dilbert.com");
    var request = WebRequest.CreateDefault(dilbertUrl);
    string html;
    using (var webResponse = request.GetResponse())
    using (var receiveStream = webResponse.GetResponseStream())
    using (var readStream = new StreamReader(receiveStream, Encoding.UTF8))
        html = readStream.ReadToEnd();

    var regex = new Regex(@"dyn/str_strip/[0-9/]+/[0-9]*\.strip\.gif");
    var match = regex.Match(html);
    if (!match.Success) return null;
    string s = match.Value;
    var groups = match.Groups;
    if (groups.Count > 0)
        s = groups[groups.Count - 1].ToString(); // the last group is the one we care about

    var imageUrl = new Uri(dilbertUrl, s);
    var imageRequest = WebRequest.CreateDefault(imageUrl);
    using (var imageResponse = imageRequest.GetResponse())
    using (var imageStream = imageResponse.GetResponseStream())
    {
        System.Drawing.Image image_ = System.Drawing.Image.FromStream(imageStream, true /*useEmbeddedColorManagement*/, true /*validateImageData*/);
        return (System.Drawing.Image)image_.Clone(); // "You must keep the stream open for the lifetime of the Image."
    }
}
Now, I would like to call GetDilbert() asynchronously. The easy way is to use a delegate:
Func<System.Drawing.Image> getDilbert;

IAsyncResult BeginGetDilbert(AsyncCallback callback, object state)
{
    getDilbert = GetDilbert;
    return getDilbert.BeginInvoke(callback, state);
}

System.Drawing.Image EndGetDilbert(IAsyncResult result)
{
    return getDilbert.EndInvoke(result);
}
While that certainly works, it isn't very efficient as the delegate thread will spend most of its time waiting for the two I/O operations.
What I would like to do is to call request.BeginGetResponse(), do the regex match, and then call imageRequest.BeginGetResponse(), all while using the standard async call pattern and preserving the signatures of BeginGetDilbert() and EndGetDilbert().
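To make the goal concrete, here is a rough sketch of the shape I have in mind, not a tested solution. SimpleAsyncResult<T>, DilbertFetcher, OnPageResponse and OnImageResponse are hypothetical names invented for illustration, and copying the image with new Bitmap(...) is a swapped-in alternative to Clone(). Note that ReadToEnd() and the image decode still block on the callback thread; only the two waits for response headers become asynchronous.

```csharp
using System;
using System.Drawing;
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading;

// Hypothetical helper: a minimal IAsyncResult carrying a result or an exception.
public sealed class SimpleAsyncResult<T> : IAsyncResult
{
    private readonly AsyncCallback callback;
    private readonly ManualResetEvent done = new ManualResetEvent(false);
    private T result;
    private Exception error;

    public SimpleAsyncResult(AsyncCallback callback, object state)
    {
        this.callback = callback;
        AsyncState = state;
    }

    public object AsyncState { get; private set; }
    public WaitHandle AsyncWaitHandle { get { return done; } }
    public bool CompletedSynchronously { get { return false; } }
    public bool IsCompleted { get; private set; }

    // Intended to be called once, when the whole chain has finished or failed.
    public void Complete(T value, Exception exception)
    {
        result = value;
        error = exception;
        IsCompleted = true;
        done.Set();
        if (callback != null) callback(this);
    }

    // Blocks until completion, then returns the result or rethrows the failure.
    public T End()
    {
        done.WaitOne();
        if (error != null) throw error; // sketch: loses the original stack trace
        return result;
    }
}

public class DilbertFetcher
{
    static readonly Uri DilbertUrl = new Uri("http://dilbert.com");
    static readonly Regex StripRegex = new Regex(@"dyn/str_strip/[0-9/]+/[0-9]*\.strip\.gif");

    public IAsyncResult BeginGetDilbert(AsyncCallback callback, object state)
    {
        var result = new SimpleAsyncResult<Image>(callback, state);
        var request = WebRequest.CreateDefault(DilbertUrl);
        // Step 1: start the page request without parking a thread on it.
        request.BeginGetResponse(ar => OnPageResponse(request, ar, result), null);
        return result;
    }

    void OnPageResponse(WebRequest request, IAsyncResult ar, SimpleAsyncResult<Image> result)
    {
        Exception error = null;
        try
        {
            string html;
            using (var response = request.EndGetResponse(ar))
            using (var reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
                html = reader.ReadToEnd();

            // Step 2: the regex match runs on the callback thread.
            var match = StripRegex.Match(html);
            if (match.Success)
            {
                // Step 3: start the image request; its callback completes the operation.
                var imageRequest = WebRequest.CreateDefault(new Uri(DilbertUrl, match.Value));
                imageRequest.BeginGetResponse(ar2 => OnImageResponse(imageRequest, ar2, result), null);
                return;
            }
        }
        catch (Exception e) { error = e; }
        result.Complete(null, error); // no match (null image, as in GetDilbert) or a failure
    }

    void OnImageResponse(WebRequest imageRequest, IAsyncResult ar, SimpleAsyncResult<Image> result)
    {
        Image image = null;
        Exception error = null;
        try
        {
            using (var response = imageRequest.EndGetResponse(ar))
            using (var stream = response.GetResponseStream())
            using (var decoded = Image.FromStream(stream, true, true))
                image = new Bitmap(decoded); // copy, so the stream can be closed
        }
        catch (Exception e) { error = e; }
        result.Complete(image, error);
    }

    public Image EndGetDilbert(IAsyncResult result)
    {
        return ((SimpleAsyncResult<Image>)result).End();
    }
}
```

This keeps the Begin/End signatures, but the hand-written callback chain and error plumbing are exactly the part that feels like a royal pain.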
I've tried several approaches and haven't been completely satisfied with any of them; this seems to be a royal pain. Hence, the question. :-)
EDIT: It seems that the approaches using iterators are frowned on by the C# compiler team.
A plea from the compiler team:
Though it is assuredly the case that you CAN use iterators to implement state machines, poor-mans coroutines, and so on, I wish people would not do so.
Please use tools for the purposes for which they were intended. If you want to write state machines, write yourself a library that is designed specifically to solve that general problem and then use it.
Using tools for purposes other than what they were intended for is "clever", and clever is bad; clever is hard for maintenance programmers to understand, clever is hard to extend, clever is hard to reason about, clever makes people think "out of the box"; there's good stuff in that box.
Going with the Future<> answer because that stays in C#, which is the same as my sample code. Unfortunately, neither the TPL nor F# is officially supported by Microsoft...yet.