Extract from CLR via C# on Boxing / Unboxing value types ...

On Boxing: If the nullable instance is not null, the CLR takes the value out of the nullable instance and boxes it. In other words, a Nullable<Int32> with a value of 5 is boxed into a boxed-Int32 with a value of 5.

On Unboxing: Unboxing is simply the act of obtaining a reference to the unboxed portion of a boxed object. The problem is that a boxed value type cannot be simply unboxed into a nullable version of that value type because the boxed value doesn't have the boolean hasValue field in it. So, when unboxing a value type into a nullable version, the CLR must allocate a Nullable<T> object, initialize the hasValue field to true, and set the value field to the same value that is in the boxed value type. This impacts your application performance (memory allocation during unboxing).
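
The behavior described in the extract can be observed with a minimal sketch like the one below. The class and variable names are illustrative and do not come from the book; the printed values in the comments are what I would expect from the rules quoted above.

```csharp
using System;

class NullableBoxingDemo
{
    static void Main()
    {
        int? five = 5;
        object boxed = five;                    // boxes the underlying Int32, not Nullable<Int32>
        Console.WriteLine(boxed.GetType());     // System.Int32

        int? none = null;
        object boxedNone = none;                // nothing is boxed; the result is a null reference
        Console.WriteLine(boxedNone == null);   // True

        int? roundTrip = (int?)boxed;           // unboxing a boxed Int32 into a Nullable<Int32>
        Console.WriteLine(roundTrip.HasValue);  // True
        Console.WriteLine(roundTrip.Value);     // 5
    }
}
```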

Why did the CLR team go through so much trouble for Nullable types? Why was it not simply boxed into a Nullable<Int32> in the first place?

A: 

I guess that is basically what it does. The description given includes your suggestion (i.e. boxing into a Nullable<T>).

The extra is that it sets the hasValue field after boxing.

André Neves
+8  A: 

I remember this behavior was a fairly last-minute change. In early betas of .NET 2.0, Nullable<T> was a "normal" value type. Boxing a null-valued int? turned it into a boxed int? with a boolean flag. I think they chose the current approach for consistency. Say:

int? test = null;
object obj = test;
if (test != null)
   Console.WriteLine("test is not null");
if (obj != null)
   Console.WriteLine("obj is not null");

In the former approach (box null -> boxed Nullable<T>), you wouldn't get "test is not null", but you would get "obj is not null", which is weird.

Additionally, if they had boxed a nullable value to a boxed-Nullable<T>:

int? val = 42;
object obj = val;

if (obj != null) {
   // Our object is not null, so intuitively it's an `int` value:
   int x = (int)obj; // ...but this would have failed. 
}

Besides that, I believe the current behavior makes perfect sense for scenarios like nullable database values (think SQL-CLR...).


Clarification:

The whole point of providing nullable types is to make it easy to deal with variables that have no meaningful value. They didn't want to provide two distinct, unrelated types. An int? should behave more or less like a simple int. That's why C# provides lifted operators.

"So, when unboxing a value type into a nullable version, the CLR must allocate a Nullable<T> object, initialize the hasValue field to true, and set the value field to the same value that is in the boxed value type. This impacts your application performance (memory allocation during unboxing)."

This is not true. The CLR would have to allocate memory on the stack to hold the variable whether or not it's nullable. Allocating space for an extra boolean variable is not a performance issue.

Mehrdad Afshari
From what I understand, in the current implementation (Step 1) if test is null, the CLR does not box anything and returns null. (Step 2) If the nullable instance is not null, it boxes it into a boxed-Int32. Doesn't Step 1 solve the "obj is not null" problem? Why did they have to do Step 2? Sorry, but I seem to be missing something.
Preets
Preets: You mean they would box `null` to a null reference and box `int? x = 4;` to `boxed-Nullable<int>`?
Mehrdad Afshari
Umm... yeah... is that not possible?
Preets
Preets: If they had done that, you couldn't unbox it directly to an `int`.
Mehrdad Afshari
What's the point though? Boxing a non-null nullable as a Nullable<...> is simply wasting a boolean value, thus (slightly) increasing GC pressure and reducing processor cache for no good reason. The whole idea behind Nullable<...> is that it represents a value type that happens to be able to be null - but that entire extra step is unnecessary for boxed values which can inherently be null anyhow.
Eamon Nerbonne
While I can sort of understand the processor cache argument (I don't think it matters most of the time), I am not sure about the GC pressure argument. Whether you box an int or box a Nullable<int>, the GC still handles it as a single block. Creating it is just an allocation (near-free with the GC), and deleting means mark/sweep/compact is still going to mark, sweep, and compact a block regardless. I can't see any difference in GC load, just a difference in the size of the allocated block. I think the crux of the matter is what Mehrdad stated: "unbox directly to an int".
jrista
+5  A: 

I think it makes sense to box a null value to a null reference. Having a boxed value saying "I know I would be an Int32 if I had a value, but I don't" seems unintuitive to me. Better to go from the value type version of "not a value" (a value with HasValue as false) to the reference type version of "not a value" (a null reference).

I believe this change was made on the feedback of the community, btw.

This also allows an interesting use of as even for value types:

object mightBeADouble = GetMyValue();

double? unboxed = mightBeADouble as double?;
if (unboxed != null)
{
    ...
}

This is more consistent with the way "uncertain conversions" are handled with reference types than the previous approach:

object mightBeADouble = GetMyValue();

if (mightBeADouble is double)
{
    double unboxed = (double) mightBeADouble;
    ...
}

(It may also perform better, as there's only a single execution time type check.)
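
For a version that compiles on its own, here is a minimal sketch of the `as` pattern. TryGetDouble is a hypothetical helper introduced only for illustration, and the outputs in the comments are what I would expect given the boxing rules discussed in this thread.

```csharp
using System;

class AsNullableDemo
{
    // Hypothetical helper: yields the double if 'value' is a boxed double,
    // otherwise a null double? (HasValue == false).
    static double? TryGetDouble(object value)
    {
        return value as double?;
    }

    static void Main()
    {
        Console.WriteLine(TryGetDouble(3.5).HasValue);    // True
        Console.WriteLine(TryGetDouble("text").HasValue); // False: not a double at all
        Console.WriteLine(TryGetDouble(null).HasValue);   // False: null reference
        Console.WriteLine(TryGetDouble(3).HasValue);      // False: boxed int, not a boxed double
    }
}
```

Note that the last case shows `as double?` does not perform numeric conversions; it only succeeds when the boxed object really is a double.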

Jon Skeet
Thanks for the info with `as`, that's pretty cool!
Lucero
+2  A: 

A thing that you gain via this behavior is that the boxed version implements all interfaces supported by the underlying type. (The goal is to make Nullable<int> appear the same as int for all practical purposes.) Boxing to a boxed-Nullable<int> instead of a boxed-int would prevent this behavior.

From the MSDN page:

double? d = 44.4;
object iBoxed = d;
// Access IConvertible interface implemented by double.
IConvertible ic = (IConvertible)iBoxed;
int i = ic.ToInt32(null);
string str = ic.ToString();

Also, getting the int from a boxed version of a Nullable<int> is straightforward. Usually you can't unbox to a type other than the original source type.

float f = 1.5f;
object boxed_float = f;
int int_value = (int) boxed_float; // throws InvalidCastException: you cannot unbox a float to an int, you *must* unbox to a float first.

float? nullableFloat = 1.4f;
boxed_float = nullableFloat;
float fValue = (float) boxed_float;  // can unbox a float? to a float
Console.WriteLine(fValue);

Here you do not have to know whether the original was a plain value type or the nullable version of it. (You also gain a little performance: the boxed object does not need space for the hasValue boolean.)
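
Both points (interface access and unboxing to either form) can be checked with a small sketch like this one. The class and variable names are made up for the example, and IComparable is just one convenient interface that int implements.

```csharp
using System;

class BoxedNullableInterfaces
{
    static void Main()
    {
        int? maybe = 7;
        object boxed = maybe;   // stored as a boxed Int32

        // The boxed object exposes the interfaces implemented by int.
        Console.WriteLine(boxed is IComparable);   // True
        IComparable cmp = (IComparable)boxed;
        Console.WriteLine(cmp.CompareTo(3) > 0);   // True (7 compares greater than 3)

        // Nullable<int> itself does not implement that interface.
        Console.WriteLine(typeof(IComparable).IsAssignableFrom(typeof(int?))); // False

        // The same boxed object unboxes to either the plain or the nullable form.
        int plain = (int)boxed;
        int? nullable = (int?)boxed;
        Console.WriteLine(plain == nullable);      // True
    }
}
```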

Gishu