views: 275
answers: 1

I have some high-performance file transfer code which I wrote in C# using the Asynchronous Programming Model (APM) idiom (e.g., BeginRead/EndRead). This code reads a file from a local disk and writes it to a socket.

For best performance on modern hardware, it's important to keep more than one outstanding I/O operation in flight whenever possible. Thus, I post several BeginRead operations on the file, then when one completes, I call a BeginSend on the socket, and when that completes I do another BeginRead on the file. The details are a bit more complicated than that but at the high level that's the idea.

I've got the APM-based code working, but it's very hard to follow and probably has subtle concurrency bugs. I'd love to use TPL for this instead. I figured Task.Factory.FromAsync would just about do it, but there's a catch.
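
For reference, wrapping the two APM pairs involved in Tasks looks roughly like this (a sketch only; the helper names are mine, not from my actual code, and the FileStream is assumed to be opened with FileOptions.Asynchronous so the reads are truly asynchronous):

```
using System.IO;
using System.Net.Sockets;
using System.Threading.Tasks;

static class FromAsyncSketch
{
    // Wrap FileStream.BeginRead/EndRead in a Task<int> that yields the bytes read.
    public static Task<int> ReadChunkAsync(FileStream file, byte[] buffer)
    {
        return Task<int>.Factory.FromAsync(
            file.BeginRead, file.EndRead, buffer, 0, buffer.Length, null);
    }

    // Wrap Socket.BeginSend/EndSend the same way; the lambda is needed because
    // BeginSend takes more arguments than the FromAsync overloads accept directly.
    public static Task<int> SendChunkAsync(Socket socket, byte[] buffer, int count)
    {
        return Task<int>.Factory.FromAsync(
            (callback, state) => socket.BeginSend(buffer, 0, count, SocketFlags.None, callback, state),
            socket.EndSend,
            null);
    }
}
```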

All of the I/O samples I've seen (most particularly the StreamExtensions class in the Parallel Extensions Extras) assume one read followed by one write. This won't perform the way I need.
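
The pattern those samples follow is roughly the one below (a simplified sketch of the idea, not the actual StreamExtensions code; error handling is omitted). Note that it never has more than one I/O operation in flight:

```
using System.IO;
using System.Threading.Tasks;

static class SequentialCopySketch
{
    public static Task CopyAsync(Stream source, Stream destination, byte[] buffer)
    {
        // Read one chunk, then write it, then recurse to read the next chunk:
        // at most one outstanding I/O operation at any time.
        return Task<int>.Factory
            .FromAsync(source.BeginRead, source.EndRead, buffer, 0, buffer.Length, null)
            .ContinueWith(read =>
            {
                int bytesRead = read.Result;
                if (bytesRead == 0)
                {
                    // End of stream: return an already-completed task.
                    var done = new TaskCompletionSource<object>();
                    done.SetResult(null);
                    return (Task)done.Task;
                }

                return Task.Factory
                    .FromAsync(destination.BeginWrite, destination.EndWrite, buffer, 0, bytesRead, null)
                    .ContinueWith(_ => CopyAsync(source, destination, buffer))
                    .Unwrap();
            })
            .Unwrap();
    }
}
```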

I can't use something simple like Parallel.ForEach or the Extras extension Task.Factory.Iterate, because the async I/O tasks spend very little time on a worker thread, so Parallel just starts another task, resulting in potentially dozens or hundreds of pending I/O operations; far too many! You can work around that by calling Wait on your tasks, but that causes the creation of an event handle (a kernel object) and a blocking wait on the task's wait handle, which ties up a worker thread. My APM-based implementation avoids both of those things.

I've been playing around with different ways to keep multiple read/write operations in flight, and I've managed to do so using continuations that call a method that creates another task, but it feels awkward, and definitely doesn't feel like idiomatic TPL.

Has anyone else grappled with an issue like this with the TPL? Any suggestions?

+2  A: 

If you're worried about too many threads, you can just set ParallelOptions.MaxDegreeOfParallelism to an acceptable number in your call to Parallel.ForEach.
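
Something along these lines (a minimal sketch; the chunk collection and TransferChunk are placeholders for your actual per-chunk work, not names from your code):

```
using System;
using System.Linq;
using System.Threading.Tasks;

class MaxDopSketch
{
    static void Main()
    {
        var chunks = Enumerable.Range(0, 100).ToArray();

        // Cap the number of concurrently executing loop bodies.
        var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };

        // At most 4 chunks are processed at once; ForEach blocks until all complete.
        Parallel.ForEach(chunks, options, chunk => TransferChunk(chunk));
    }

    static void TransferChunk(int chunk)
    {
        // Placeholder for the actual read-from-file / write-to-socket work.
        Console.WriteLine("Transferred chunk {0}", chunk);
    }
}
```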

Gabe
Thanks for the response. It's true that I can do that, but the way `Parallel.ForEach` implements it is with a `Wait` on the running tasks. I had hoped to avoid that because: `Wait` uses `ManualResetEventSlim`, which spins briefly and then falls back to a kernel Event object; creating a kernel Event object takes many instructions and a switch to kernel mode; and waiting on a kernel Event object requires another switch to kernel mode. Since I didn't have to do either of those things in my APM implementation, I was hoping there was some idiomatic TPL way that would avoid them as well.
anelson
When performing I/O, the last thing you need to worry about is "many instructions and a switch to kernel mode". But since `EndRead` also waits on an Event handle, I don't understand your concern.
Gabe
I tried this in my own test harness, and I was surprised how minimal the penalty is for doing these sorts of waits. It's not the answer I wanted, but it seems right, so I'm accepting this answer.
anelson