I have some high-performance file transfer code that I wrote in C# using the Asynchronous Programming Model (APM) idiom (e.g., BeginRead/EndRead). This code reads a file from a local disk and writes it to a socket.
For best performance on modern hardware, it's important to keep more than one outstanding I/O operation in flight whenever possible. Thus, I post several BeginRead operations on the file; when one completes, I call BeginSend on the socket, and when that completes I post another BeginRead on the file. The details are a bit more complicated than that, but at a high level that's the idea.
I've got the APM-based code working, but it's very hard to follow and probably has subtle concurrency bugs. I'd love to use the TPL for this instead. I figured Task.Factory.FromAsync would just about do it, but there's a catch.
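For reference, a minimal sketch of the kind of FromAsync wrapping meant here, assuming a plain Stream as the source. The class and method names (FromAsyncSketch, ReadChunkAsync) are made up for illustration. It covers a single read only and does nothing by itself to keep several operations in flight.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
static class FromAsyncSketch
{
    // Wraps one BeginRead/EndRead pair in a Task<int>.
    // The task's result is the number of bytes read into 'buffer'.
    public static Task<int> ReadChunkAsync(Stream source, byte[] buffer)
    {
        return Task<int>.Factory.FromAsync(
            source.BeginRead,   // APM begin method
            source.EndRead,     // APM end method
            buffer, 0, buffer.Length,
            null);              // no state object
    }
}
```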
All of the I/O samples I've seen (most particularly the StreamExtensions class in the Parallel Extensions Extras) assume one read followed by one write. This won't perform the way I need.
I can't use something simple like Parallel.ForEach or the Extras extension Task.Factory.Iterate, because the async I/O tasks don't spend much time on a worker thread, so Parallel just starts up another task. That results in potentially dozens or hundreds of pending I/O operations, which is far too many. You can work around that by calling Wait on your tasks, but that causes creation of an event handle (a kernel object) and a blocking wait on the task's wait handle, which ties up a worker thread. My APM-based implementation avoids both of those things.
I've been playing around with different ways to keep multiple read/write operations in flight, and I've managed to do so using continuations that call a method that creates another task, but it feels awkward, and definitely doesn't feel like idiomatic TPL.
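A rough sketch of that continuation pattern, under some assumptions: the work is exposed as a sequence of Func<Task> delegates (each one would start a single FromAsync-wrapped read or send), and BoundedPump, Run, and maxInFlight are hypothetical names, not TPL APIs. It keeps at most maxInFlight operations pending without blocking a thread, but it ignores the ordering of reads and sends, which real transfer code would have to handle.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: keeps up to maxInFlight operations pending,
// starting the next one from a continuation instead of waiting.
sealed class BoundedPump
{
    private readonly IEnumerator<Func<Task>> _work;
    private readonly object _gate = new object();
    private readonly TaskCompletionSource<bool> _done =
        new TaskCompletionSource<bool>();
    private int _inFlight;
    private bool _exhausted;
    private AggregateException _error;

    private BoundedPump(IEnumerator<Func<Task>> work) { _work = work; }

    public static Task Run(IEnumerable<Func<Task>> work, int maxInFlight)
    {
        var pump = new BoundedPump(work.GetEnumerator());
        // Prime the pump; each completion then starts at most one successor.
        for (int i = 0; i < maxInFlight; i++) pump.StartNext();
        return pump._done.Task;
    }

    private void StartNext()
    {
        Func<Task> start = null;
        bool finished = false;
        lock (_gate)
        {
            if (!_exhausted && _work.MoveNext())
            {
                start = _work.Current;
                _inFlight++;
            }
            else
            {
                _exhausted = true;
                finished = _inFlight == 0;
            }
        }

        if (start == null)
        {
            if (finished) Complete();
            return;
        }

        // No Wait() here: the continuation runs when the I/O completes.
        start().ContinueWith(t =>
        {
            lock (_gate)
            {
                _inFlight--;
                if (t.IsFaulted)
                {
                    _error = t.Exception;  // observe the fault
                    _exhausted = true;     // stop issuing new work
                }
            }
            StartNext();
        });
    }

    private void Complete()
    {
        // Try* variants make repeated calls harmless.
        if (_error != null) _done.TrySetException(_error.InnerExceptions);
        else _done.TrySetResult(true);
    }
}
```

The lock is held only long enough to advance the enumerator and update counters, so no thread blocks for the duration of an I/O operation. A synchronous exception thrown by a delegate itself is not handled in this sketch.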
Has anyone else grappled with an issue like this with the TPL? Any suggestions?