tags:

views:

144

answers:

1

Introduction

A user reported to me this morning that he was having an issue with inconsistent results (namely, column values sometimes coming out null when they should not be) of some parallel execution code that we provide as part of an internal framework. This code has worked fine in the past and has not been tampered with lately, but it got me to thinking about the following snippet:

Code Sample

                lock (ResultTable)
                {
                    newRow = ResultTable.NewRow();
                }

                newRow["Key"] = currentKey;
                foreach (KeyValuePair<string, object> output in outputs)
                {
                    object resultValue = output.Value;
                    newRow[output.Name] = resultValue != null ? resultValue : DBNull.Value;
                }

                lock (ResultTable)
                {
                    ResultTable.Rows.Add(newRow);
                }

(No guarantees that that compiles, hand-edited to mask proprietery information.)

Explanation

We have this cascading type of locking code other places in our system, and it works fine, but this is the first instance of cascading locking code that I have come across that interacts with ADO .NET. As we all know, members of framework objects are usually not thread safe (which is the case in this situation), but the cascading locking should ensure that we are not reading and writing to ResultTable.Rows concurrently. We are safe, right?

Hypothesis

Well, the cascading lock code does not ensure that we are not reading from or writing to ResultTable.Rows at the same time that we are assigning values to columns in the new row. What if ADO .NET uses some kind of buffer for assigning column values that is not thread safe--even when different object types are involved (DataTable vs. DataRow)?

Has anyone run into anything like this before? I thought I would ask here at StackOverflow before beating my head against this for hours on end :)

Conclusion

Well, the consensus appears to be that changing the cascading lock to a full lock has resolved the issue. That is not the result that I expected, but the full lock version has not produced the issue after many, many, many tests.

The lesson: be wary of cascading locks used on APIs that you do not control. Who knows what may be going on under the covers!

+1  A: 

Allen,

I could not find any specific problems with your approach, not that my testing was exhaustive. Here are some ideas that we stick with (all of our applications are thread centric):

Whenever possible:

[1] Make all data access completely atomic. As data sharing in multi-threaded applications is an excellent place for all kinds of unforeseen thread interaction.

[2] Avoid locking on a type. If the type is not know to be thread safe write a wrapper.

[3] Include structures that allow for the fast identification of threads that are accessing a shared resource. If system performance allows, log this information above the debug level and below usual operation log levels.

[4] Any code, including System.* et.al, not explicitly documented internally as Thread Safe Tested is not Thread Safe. Hearsay and the verbal word of others does not count. Test it and write it down.

Hope this is of some value.

Rusty
Hey, Rusty! Thanks for all the time you spent looking into this! And thanks for the tips too!
Allen E. Scharfenberg
@Allen: When you learn more please post it :)
Rusty
@Rusty Will do!
Allen E. Scharfenberg
@Rusty Added conclusion to the original post above.
Allen E. Scharfenberg