This question comes up occasionally but I haven't seen a satisfactory answer.
A typical pattern is (row is a DataRow):
if (row["value"] != DBNull.Value)
{
    someObject.Member = row["value"];
}
My first question is which is more efficient (I've flipped the condition):
row["value"] == DBNull.Value; // Or
row["value"] is DBNull; // Or
row["value"].GetType() == typeof(DBNull) // Or... any suggestions?
This indicates that .GetType() should be faster, but maybe the compiler knows a few tricks I don't?
Second question: is it worth caching the value of row["value"], or does the compiler optimize the repeated indexer call away anyway?
For example:
object valueHolder;
if (DBNull.Value == (valueHolder = row["value"])) {}
Disclaimers:
- row["value"] exists.
- I don't know the column index of the column (hence the column name lookup)
- I'm asking specifically about checking for DBNull and then assignment (not about premature optimization etc).
Edit:
I benchmarked a few scenarios (elapsed time for 10,000,000 trials each):
row["value"] == DBNull.Value: 00:00:01.5478995
row["value"] is DBNull: 00:00:01.6306578
row["value"].GetType() == typeof(DBNull): 00:00:02.0138757
Object.ReferenceEquals has the same performance as "=="
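
For anyone who wants to repeat the comparison, here is a minimal sketch of how such a timing loop could look. It is not the code that produced the numbers above; the names DbNullCheckTimings, BuildRow and Time are made up for illustration, and going through a delegate adds a fixed overhead to every variant, so only the relative differences are meaningful.

```csharp
using System;
using System.Data;
using System.Diagnostics;

public static class DbNullCheckTimings
{
    // Hypothetical setup: a one-row table whose "value" column holds DBNull.
    public static DataRow BuildRow()
    {
        var table = new DataTable();
        table.Columns.Add("value", typeof(string));
        DataRow row = table.NewRow();
        row["value"] = DBNull.Value;
        table.Rows.Add(row);
        return row;
    }

    // Runs the given check `trials` times against the row and returns the elapsed time.
    public static TimeSpan Time(Func<DataRow, bool> check, DataRow row, int trials)
    {
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < trials; i++)
        {
            check(row);
        }
        watch.Stop();
        return watch.Elapsed;
    }

    // The three checks from the question, expressed as delegates.
    public static readonly Func<DataRow, bool> ByEquality =
        r => r["value"] == DBNull.Value;
    public static readonly Func<DataRow, bool> ByIsOperator =
        r => r["value"] is DBNull;
    public static readonly Func<DataRow, bool> ByGetType =
        r => r["value"].GetType() == typeof(DBNull);
}
```

Calling DbNullCheckTimings.Time(DbNullCheckTimings.ByEquality, row, 10000000) for each of the three delegates, ideally after a warm-up call, gives one elapsed time per variant.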
The most interesting result? If the column name you pass differs from the real name only by case (e.g. "Value" instead of "value"), the lookup takes roughly eight times longer (for a string):
row["Value"] == DBNull.Value: 00:00:12.2792374
The moral of the story seems to be that if you can't look up a column by its index, then ensure that the column name you feed to the indexer matches the DataColumn's name exactly.
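
A related option, which I did not benchmark, is to resolve the column once and then index each row by the DataColumn itself, so the name lookup is paid only once per table rather than once per row. A rough sketch, assuming you are looping over the rows of a single DataTable (ColumnLookup and CountNonNull are hypothetical names used only for illustration):

```csharp
using System;
using System.Data;

public static class ColumnLookup
{
    // Counts the rows whose value in the named column is not DBNull.
    // The column is looked up by name once; each row is then indexed by DataColumn.
    public static int CountNonNull(DataTable table, string columnName)
    {
        var column = table.Columns[columnName]; // null if no such column exists
        if (column == null)
        {
            throw new ArgumentException("No column named '" + columnName + "'.");
        }

        int count = 0;
        foreach (DataRow row in table.Rows)
        {
            object value = row[column]; // DataRow indexer overload taking a DataColumn
            if (value != DBNull.Value)
            {
                count++;
            }
        }
        return count;
    }
}
```

If only the position is needed, column.Ordinal gives the index for the int overload of the row indexer.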
Caching the value also appears to be nearly twice as fast:
No Caching: 00:00:03.0996622
With Caching: 00:00:01.5659920
So the most efficient method seems to be:
object temp;
string variable;
if (DBNull.Value != (temp = row["value"]))
{
    variable = temp.ToString();
}
This was a good learning experience.