In this MSDN Magazine article, the author states (emphasis mine):

Note that boxing always creates a new object and copies the unboxed value's bits to the object. On the other hand, unboxing simply returns a pointer to the data within a boxed object: no memory copy occurs. However, it is commonly the case that your code will cause the data pointed to by the unboxed reference to be copied anyway.

I'm confused by the sentence I've bolded and the sentence that follows it. From everything else I've read, including this MSDN page, I've never before heard that unboxing just returns a pointer to the value on the heap. I was under the impression that unboxing would result in you having a variable containing a copy of the value on the stack, just as you began with. After all, if my variable contains "a pointer to the value on the heap", then I haven't got a value type, I've got a pointer.

Can someone explain what this means? Was the author on crack? (There is at least one other glaring error in the article). And if this is true, what are the cases where "your code will cause the data pointed to by the unboxed reference to be copied anyway"?

I just noticed that the article is nearly 10 years old, so maybe this is something that changed very early on in the life of .NET.

+1  A: 

Boxing is the act of casting a value-type instance to a reference-type instance (either an object or an interface), and reference types are allocated on the heap.

According to 'C# 4.0 in a Nutshell': "...unboxing copies the contents of the object back into a value-type instance", which implies that the copy ends up on the stack.
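As a minimal sketch of the two statements above (the class and method names are made up for illustration and do not come from the book or the article): each boxing conversion allocates its own object, and a value obtained through an ordinary unboxing cast is an independent copy.

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
public static class BoxingCopyDemo
{
    // Boxing the same value twice yields two distinct heap objects.
    public static bool BoxesAreDistinct()
    {
        int v = 5;
        object a = v;        // box to object
        IComparable b = v;   // box to an interface: a second, separate box
        return !ReferenceEquals(a, b);
    }

    // Changing the unboxed copy leaves the boxed value untouched.
    public static int BoxedValueAfterChangingCopy()
    {
        object o = 5;        // box
        int v = (int)o;      // unbox and copy the value into v
        v = 123;             // modifies only the copy
        return (int)o;       // the box still holds 5
    }

    public static void Main()
    {
        Console.WriteLine(BoxesAreDistinct());            // True
        Console.WriteLine(BoxedValueAfterChangingCopy()); // 5
    }
}
```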

In the article you reference, the author states:

public static void Main() {

   Int32 v = 5;    // Create an unboxed value type variable
   Object o = v;   // o refers to a boxed version of v
   v = 123;        // Changes the unboxed value to 123

   Console.WriteLine(v + ", " + (Int32) o);    // Displays "123, 5"
}

From this code, can you guess how many boxing operations occur? You might be surprised to discover that the answer is three! Let's analyze the code carefully to really understand what's going on. First, an Int32 unboxed value type (v) is created and initialized to 5. Then an Object reference type (o) is created and it wants to point to v. But reference types must always point to objects in the heap, so C# generated the proper IL code to box v and stored the address of the boxed version of v in o. Now 123 is unboxed and the referenced data is copied into the unboxed value type v; this has no effect on the boxed version of v, so the boxed version keeps its value of 5. Note that this example shows how o is unboxed (which returns a pointer to the data in o), and then the data in o is memory copied to the unboxed value type v.

Mitch Wheat
Yes, everything I've read says or implies that the data is copied back to the stack. As for this explanation contradicting itself: yes, and the part of the quote you pasted in bold is actually what I was referring to when I said there was an error in the article. It's absolutely wrong - o is not unboxed into v anywhere in the example.
Charles
+1  A: 

The article is accurate. However, it talks about what really goes on at run time, not about the IL that the compiler generates. After all, a .NET program never executes IL; it executes the machine code that the JIT compiler generates from the IL.

And the unbox opcode indeed generates code that produces a pointer to the bits on the heap that represent the value type value. The JIT generates a call to a small helper function in the CLR named "JIT_Unbox". It lives in clr\src\vm\jithelpers.cpp if you have the SSCLI20 source code. The Object::GetData() function returns the pointer.
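One way to observe this from C# is sketched below. It assumes a modern .NET runtime where System.Runtime.CompilerServices.Unsafe.Unbox<T> is available (it is not mentioned in the article or in this answer); that method returns a reference into the boxed object without copying the value. The class and method names are hypothetical.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical demo class; assumes Unsafe.Unbox<T> is available on the target runtime.
public static class UnboxPointerDemo
{
    // Writes through the reference produced by unboxing and returns
    // the value that the box holds afterwards.
    public static int MutateThroughUnbox()
    {
        object o = 5;                               // box: heap object holding a copy of 5
        ref int inside = ref Unsafe.Unbox<int>(o);  // reference into the box, no copy made
        inside = 123;                               // writes into the heap object itself
        return (int)o;                              // ordinary cast: unbox, then copy
    }

    public static void Main()
    {
        Console.WriteLine(MutateThroughUnbox());    // 123
    }
}
```

An ordinary cast such as (int)o never exposes that reference; the compiler copies the value out immediately, which is why the copy usually happens anyway.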

From there, the value most commonly first gets copied into a CPU register. Which then may get stored somewhere. It doesn't have to be the stack, it could be a member of a reference type object (the gc heap). Or a static variable (the loader heap). Or it could be pushed on the stack (method call). Or the CPU register could be used as-is when the value is used in an expression.
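The destinations listed above can be sketched in C# as follows. This only shows the source-level forms; which registers or memory locations the JIT actually uses depends on the runtime and optimization settings. Holder, UnboxDestinations, and the member names are hypothetical.

```csharp
using System;

// Hypothetical reference type used as a copy destination on the GC heap.
public sealed class Holder
{
    public int Field;
}

public static class UnboxDestinations
{
    private static int s_static;                // static variable

    private static int Twice(int x) => x * 2;   // receives the value as an argument

    public static int Run()
    {
        object o = 5;                   // boxed value

        int local = (int)o;             // copied into a local (stack slot or just a register)

        var h = new Holder();
        h.Field = (int)o;               // copied into a field of a heap object

        s_static = (int)o;              // copied into a static variable

        int viaCall = Twice((int)o);    // copied into a method argument

        int inExpr = (int)o + 1;        // used directly in an expression

        return local + h.Field + s_static + viaCall + inExpr;
    }

    public static void Main()
    {
        Console.WriteLine(Run());       // 5 + 5 + 5 + 10 + 6 = 31
    }
}
```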

While debugging, right-click the editor window and choose "Go To Disassembly" to see the machine code.

Hans Passant
I see! Thank you.
Charles