There's a bit of code which writes data to a MemoryStream object directly into its data buffer, obtained by calling GetBuffer(). It also updates the Position property and calls SetLength() appropriately.
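For readers who haven't seen this pattern, here is a minimal sketch of what such a direct write might look like. It is not the actual code from this project: the `DirectBufferWriter` class and `Append` method are hypothetical names, and it assumes the stream was created with the default constructor so that GetBuffer() is allowed.

```csharp
using System;
using System.IO;

static class DirectBufferWriter
{
    // Hypothetical helper: appends 'data' at the stream's current Position
    // by copying straight into the array returned by GetBuffer().
    public static void Append(MemoryStream stream, byte[] data)
    {
        int start = checked((int)stream.Position);
        int end = start + data.Length;

        // SetLength may reallocate the internal array when growing, so it
        // runs before GetBuffer(); a buffer fetched earlier could be stale.
        if (end > stream.Length)
            stream.SetLength(end);

        byte[] buffer = stream.GetBuffer();
        Buffer.BlockCopy(data, 0, buffer, start, data.Length);

        // Advance Position manually, since Write() was bypassed.
        stream.Position = end;
    }
}
```

Note the ordering: the length is set first, the buffer is fetched second, and Position is updated last.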
This code works properly 99.9999% of the time. Literally. It only barfs once every several hundred thousand iterations. The specific problem is that the Position property of MemoryStream suddenly returns zero instead of the appropriate value.
Code was then added that checks for the 0 and throws an exception. The exception message includes a log of MemoryStream properties such as Position and Length, gathered in a separate method, and those return the correct values. Further logging added within the same method shows that when this rare condition occurs, Position is zero only inside this particular method.
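To make the diagnostic setup concrete, here is a rough sketch of that kind of check. The names `StreamDiagnostics`, `ThrowIfPositionZero` and `Describe` are made up for illustration, and the real code logs more than this.

```csharp
using System;
using System.IO;
using System.Threading;

static class StreamDiagnostics
{
    // Hypothetical guard: call at a point where Position must be non-zero.
    public static void ThrowIfPositionZero(MemoryStream stream)
    {
        long observed = stream.Position;
        if (observed == 0)
            throw new InvalidOperationException(Describe(stream, observed));
    }

    // Separate method that re-reads the properties, so the message shows
    // both the value seen by the caller and the value seen here.
    public static string Describe(MemoryStream stream, long observed)
    {
        return $"Observed Position={observed}; re-read Position={stream.Position}, " +
               $"Length={stream.Length}, Capacity={stream.Capacity}, " +
               $"ThreadId={Thread.CurrentThread.ManagedThreadId}";
    }
}
```

Logging the observed value next to the re-read value is what exposes the mismatch between the two methods.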
Okay. Obviously, this must be a threading issue. And most likely a compiler optimization issue.
However, the nature of this software is that it's organized by "tasks" with a scheduler, so any one of several actual O/S threads may run this code at any given time--but never more than one at a time.
So it's my guess that ordinarily the same thread happens to keep getting used for this method, and then on a rare occasion a different thread gets used. (I just coded up a test of this theory that captures and compares the thread id.)
Then due to compiler optimizations, the different thread never gets the correct value. It gets a "stale" value.
Ordinarily in a situation like this I would apply a "volatile" keyword to the variable in question to see if that fixes it. But in this case the variables are inside the MemoryStream object.
Does anyone have any other ideas? Or does this mean we have to implement our own MemoryStream object?
Sincerely, Wayne
EDIT: Just ran a test which counts the total number of calls to this method and the number of times the ManagedThreadId differs from the one on the previous call. It switches threads almost exactly 50% of the time, alternating between them. So my theory above is almost certainly wrong, or the error would occur far more often.
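The counting test was along these lines. This is a simplified sketch, not the exact code: `ThreadSwitchCounter` and `Record` are hypothetical names, and `Interlocked` is used here so the counters themselves can't be a source of stale reads.

```csharp
using System;
using System.Threading;

sealed class ThreadSwitchCounter
{
    private int _lastThreadId = -1;   // -1 means "no call recorded yet"
    private long _calls;
    private long _switches;

    public long Calls => Interlocked.Read(ref _calls);
    public long Switches => Interlocked.Read(ref _switches);

    // Call at the top of the method under test.
    public void Record()
    {
        int current = Thread.CurrentThread.ManagedThreadId;
        int previous = Interlocked.Exchange(ref _lastThreadId, current);

        Interlocked.Increment(ref _calls);
        if (previous != -1 && previous != current)
            Interlocked.Increment(ref _switches);
    }
}
```

Dividing `Switches` by `Calls` gives the switch rate; in my run that ratio came out at roughly one half.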
EDIT: This bug occurs so rarely that it would take nearly a week to run without the bug before feeling any confidence it's really gone. Instead, it's better to run experiments to confirm precisely the nature of the problem.
EDIT: Locking currently is handled via lock() statements in each of 5 methods that use the MemoryStream.
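For reference, the locking has roughly the following shape. This is a cut-down sketch with three illustrative methods instead of the real five, the class and member names are invented, and it assumes every method locks the same private object. If any of the methods locked a different object, the lock() statements would not exclude each other.

```csharp
using System;
using System.IO;

sealed class GuardedStream
{
    // Single lock object shared by every method that touches the stream.
    private readonly object _sync = new object();
    private readonly MemoryStream _stream = new MemoryStream();

    public void Write(byte[] data)
    {
        lock (_sync)
        {
            _stream.Write(data, 0, data.Length);
        }
    }

    public long ReadPosition()
    {
        lock (_sync)
        {
            return _stream.Position;
        }
    }

    public void Reset()
    {
        lock (_sync)
        {
            // Truncating to zero also pulls Position back to zero.
            _stream.SetLength(0);
        }
    }
}
```

One thing worth checking against a layout like this is whether any code path resets or truncates the stream between a write and the later read of Position.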