Is the following pattern of multi-threaded calls acceptable to a .NET FileStream?

Several threads calling a method like this:

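long offset = whatever; // different for each thread (Seek takes a long)
byte[] buffer = new byte[8192];
object state = someState; // unique for each call, hence also for each thread
IAsyncResult result; // declared outside the lock so it is still in scope below

lock(theFile)
{
    // the lock keeps the Seek and the BeginRead that depends on it together
    theFile.Seek(offset, SeekOrigin.Begin);
    result = theFile.BeginRead(buffer, 0, 8192, AcceptResults, state);
}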

if(result.CompletedSynchronously)
{
    // is it required for us to call AcceptResults ourselves in this case?
    // or did BeginRead already call it for us, on this thread or another?
}

Where AcceptResults is:

void AcceptResults(IAsyncResult result)
{
    lock(theFile)
    {
         int bytesRead = theFile.EndRead(result);

         // if we guarantee that the offset of the original call was at least 8192 bytes from
         // the end of the file, and thus all 8192 bytes exist, can the FileStream read still
         // actually read fewer bytes than that?

         // either:
         if(bytesRead != 8192)
         {
             Panic("Page read borked");
         }

         // or:
         // issue a new call to begin read, moving the offsets into the FileStream and
         // the buffer, and decreasing the requested size of the read to whatever remains of the buffer
    }
}
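
On the "either/or" in the comments above: the documented contract for Stream.Read and EndRead allows a return value smaller than the requested count even when the end of the file has not been reached, so the "or" branch looks like the safer one. Below is a minimal sketch of that retry loop, written synchronously to keep it short. PageReader and ReadFully are hypothetical names, not part of the framework.

```csharp
using System;
using System.IO;

public static class PageReader
{
    // Hypothetical helper: keeps reading until 'count' bytes have arrived
    // or the stream ends. Returns the number of bytes actually read.
    public static int ReadFully(Stream stream, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            // Read may legally return fewer bytes than requested.
            int n = stream.Read(buffer, offset + total, count - total);
            if (n == 0)
            {
                break; // end of stream reached before the buffer was full
            }
            total += n;
        }
        return total;
    }
}
```

In the asynchronous version the same idea would apply inside AcceptResults: if bytesRead is greater than zero but less than what remains, take the lock, Seek to the original offset plus the bytes received so far, and call BeginRead again with the buffer offset advanced by the same amount.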

I'm confused because the documentation seems unclear to me. For example, the FileStream class says:

Any public static members of this type are thread safe. Any instance members are not guaranteed to be thread safe.

But the documentation for BeginRead seems to contemplate having multiple read requests in flight:

Multiple simultaneous asynchronous requests render the request completion order uncertain.

Are multiple reads permitted to be in flight or not? Writes? Is this the appropriate way to secure the location of the Position of the stream between the call to Seek and the call to BeginRead? Or does that lock need to be held all the way to EndRead, hence only one read or write in flight at a time?

I understand that the callback will occur on a different thread, and my handling of state and buffer accounts for that in a way that would permit multiple in-flight reads.

Further, does anyone know where in the documentation to find the answers to these questions? Or an article written by someone in the know? I've been searching and can't find anything.

Relevant documentation:

FileStream class
Seek method
BeginRead method
EndRead
IAsyncResult interface

Edit with some new information

A quick check with Reflector shows that BeginRead does capture the stream position into per-call state (some fields of the NativeOverlapped structure). It appears that EndRead doesn't consult the stream position, at least not in any obvious way. This is not conclusive, obviously, because it could be in a non-obvious way or it could be unsupported by the underlying native API.

+1  A: 

Yes, the documentation is sketchy. No clue for better docs, unfortunately.

EDIT: Actually, Joe Duffy's book Concurrent Programming on Windows has Chapter 8, on the APM, which explains the async API, IAsyncResult and such (good book and author). Still, the fundamental issue here is that MSDN says that instance members are not guaranteed to be thread safe, hence the need for appropriate synchronization.
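
On the CompletedSynchronously question: as I understand the APM convention, the callback passed to BeginRead is invoked whether or not the operation completed synchronously, and CompletedSynchronously only tells the callback that it may be running on the thread that called BeginRead. If that holds, the caller should not invoke AcceptResults itself. The sketch below puts all completion work in the callback and counts how often it runs. CallbackDemo and CountCallbacks are hypothetical names, and a MemoryStream stands in for the file, so this illustrates the pattern rather than proving FileStream's behavior.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class CallbackDemo
{
    // Returns how many times the callback ran for a single BeginRead.
    public static int CountCallbacks()
    {
        int calls = 0;
        var finished = new TaskCompletionSource<bool>();
        using (var stream = new MemoryStream(new byte[8192]))
        {
            var buffer = new byte[8192];
            stream.BeginRead(buffer, 0, buffer.Length, ar =>
            {
                // EndRead is called exactly once, here in the callback,
                // without branching on ar.CompletedSynchronously.
                stream.EndRead(ar);
                Interlocked.Increment(ref calls);
                finished.SetResult(true);
            }, null);
            finished.Task.Wait(); // block until the callback has run
        }
        return calls;
    }
}
```

Note that a C# lock is re-entrant on the same thread, so even if a callback did run inline on the thread that holds lock(theFile), the lock inside AcceptResults would not deadlock against itself.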

So you have multiple threads kicking off BeginRead on the same instance of theFile? The BeginRead page does mention this, however: "EndRead must be called exactly once for every call to BeginRead. Failing to end a read process before beginning another read can cause undesirable behavior such as deadlock." Also, you are calling Seek on theFile while other threads might be in the middle of executing their BeginRead callbacks. Not safe at all.
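
One way to sidestep the shared Position entirely, assuming the file is only being read and that an extra file handle per reader is acceptable, is to give each reader its own FileStream opened with FileShare.Read. A minimal sketch under those assumptions follows. IndependentReader and ReadPage are hypothetical names, and it uses synchronous reads for brevity.

```csharp
using System;
using System.IO;

public static class IndependentReader
{
    // Reads up to buffer.Length bytes starting at 'offset' and returns the
    // number of bytes read. Each call opens its own stream, so its Position
    // is private to the call and no lock is needed.
    public static int ReadPage(string path, long offset, byte[] buffer)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            fs.Seek(offset, SeekOrigin.Begin);
            int total = 0;
            while (total < buffer.Length)
            {
                // A single Read may return fewer bytes than requested.
                int n = fs.Read(buffer, total, buffer.Length - total);
                if (n == 0)
                {
                    break; // end of file
                }
                total += n;
            }
            return total;
        }
    }
}
```

This trades handle count and open/close cost for simplicity, and it does not address concurrent writes. If a writer also has the file open, the sharing mode would need to change accordingly.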

Chris O
Why should it matter if one thread is Seek-ing while another is executing its callback? Presumably the callback means that the requested read is complete, right? I'm more concerned about calling Seek in the time between a BeginRead and the matching callback. Unless I am missing something, the code above calls EndRead exactly once for every call to BeginRead, modulo some uncertainty about whether BeginRead invokes its callback when the IAsyncResult it returns is CompletedSynchronously.
Doug McClean
Yes, it does have one EndRead for every BeginRead. However, there's no guarantee that the EndRead will get called before another thread starts its BeginRead; the lock doesn't protect against that scenario.
Chris O
Oh, you might be OK here, the BeginRead page does say "On Windows, all I/O operations smaller than 64 KB will complete synchronously for better performance. Asynchronous I/O might hinder performance for buffer sizes smaller than 64 KB." so you *are* guaranteeing that all reads are in fact serialized by your lock. But if all threads are serialized through the locks and the synchronous reads, why bother using the async API at all?
Chris O
Good point, I missed that part about the buffer size. My question is whether there is a need to protect against the scenario where EndRead doesn't get called before another thread starts its BeginRead. The docs suggest to me that there isn't, or why would they say "multiple simultaneous asynchronous requests render the request completion order uncertain" instead of "multiple simultaneous asynchronous requests are unsupported/result in undefined behavior/will make demons fly out of your nose"?
Doug McClean
Interpreting the MSDN makes demons fly out of my nose ;-) That particular page definitely contradicts itself (MSDN doc bugs are not uncommon), though I would assume that the deadlock statement is the correct one, and the multiple simultaneous async request is in error, but that is only an assumption.
Chris O