int[] myIntegers;
myIntegers = new int[100];

In the above code, is new int[100] allocating the array on the heap? From what I've read in CLR via C#, the answer is yes. What I can't understand is what happens to the actual ints inside the array. As they are value types, I'd guess they'd have to be boxed, since I can, for example, pass myIntegers to other parts of the program, and it would clutter up the stack if they were left on it all the time. Or am I wrong? I'd guess they'd just be boxed and would live on the heap for as long as the array existed.

+8  A: 

Yes, the array will be located on the heap.

The ints inside the array will not be boxed. Just because a value type exists on the heap does not necessarily mean it is boxed. Boxing only occurs when a value type, such as int, is assigned to a reference of type object.

For example

Does not box:

int i = 42;
myIntegers[0] = 42;

Boxes:

object i = 42;
object[] arr = new object[10];  // no boxing here 
arr[0] = 42;
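To see the difference for yourself, here is a minimal sketch. The class and method names (BoxingDemo and so on) are made up for illustration and are not from the original answer. It checks that a boxed value is an independent copy, and that an int[] really does store plain ints.

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
static class BoxingDemo
{
    // Returns true when the boxed copy is unaffected by a later change to the local.
    public static bool BoxedCopyIsIndependent()
    {
        int i = 42;
        object boxed = i;   // boxing: the value is copied into a new object on the heap
        i = 43;             // changes only the local variable
        return (int)boxed == 42;
    }

    // Returns the element type of an int[] after storing a value in it.
    public static Type ElementTypeOfIntArray()
    {
        int[] myIntegers = new int[100];
        myIntegers[0] = 42; // no boxing: the int is written directly into the array's storage
        return myIntegers.GetType().GetElementType();
    }

    static void Main()
    {
        Console.WriteLine(BoxedCopyIsIndependent());
        Console.WriteLine(ElementTypeOfIntArray());
    }
}
```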

You may also want to check out Eric's post on this subject.

JaredPar
But I don't get it. Shouldn't value types be allocated on the stack? Or both value and reference types can be allocated both on heap or stack and it's just that they usually are just stored in one place or other?
devoured elysium
@Jorge, a value type with no reference type wrapper / container will live on the stack. However, once it's used within a reference type container, it will live on the heap. An array is a reference type, and hence the memory for the int must be on the heap.
JaredPar
@Jorge: reference types live only in the heap, never on the stack. Contrariwise, it is impossible (in verifiable code) to store a pointer to a stack location into an object of a reference type.
Anton Tykhyy
I think that you meant to assign i to arr[0]. The constant assignment will still cause boxing of "42", but you created i, so you may as well use it ;-)
Marcus Griep
A: 

An array of integers is allocated on the heap, nothing more, nothing less. myIntegers refers to the start of the section where the ints are allocated. That reference is located on the stack.

If you have an array of reference-type objects, such as Object, then the variable myObjects, located on the stack, refers to a block of values that in turn reference the objects themselves.

To summarize, if you pass myIntegers to a function, you only pass the reference to the place where the actual integers are allocated.
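A small sketch of that last point, with hypothetical names (ArrayPassingDemo, SetFirst): the callee writes through the reference it receives, and the caller sees the change because there is only one array.

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
static class ArrayPassingDemo
{
    // Receives a copy of the reference, not a copy of the 100 ints.
    public static void SetFirst(int[] numbers, int value)
    {
        numbers[0] = value;
    }

    public static int Run()
    {
        int[] myIntegers = new int[100];
        SetFirst(myIntegers, 7);
        return myIntegers[0]; // the caller observes the write made by SetFirst
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```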

Dykam
A: 

There is no boxing in your example code.

Value types can live on the heap as they do in your array of ints. The array is allocated on the heap and it stores ints, which happen to be value types. The contents of the array are initialized to default(int), which happens to be zero.

Consider a class that contains a value type:


    class HasAnInt
    {
        int i;
    }

    HasAnInt h = new HasAnInt();

Variable h refers to an instance of HasAnInt that lives on the heap. It just happens to contain a value type. That's perfectly okay, 'i' just happens to live on the heap as it's contained in a class. There is no boxing in this example either.

Curt Nichols
+2  A: 

To understand what's happening, here are some facts:

  • Objects are always allocated on the heap.
  • The heap only contains objects.
  • Value types are either allocated on the stack, or part of an object on the heap.
  • An array is an object.
  • An array can only contain value types.
  • An object reference is a value type.

So, if you have an array of integers, the array is allocated on the heap and the integers that it contains are part of the array object on the heap. The integers reside inside the array object on the heap, not as separate objects, so they are not boxed.

If you have an array of strings, it's really an array of string references. As references are value types they will be part of the array object on the heap. If you put a string object in the array, you actually put the reference to the string object in the array, and the string is a separate object on the heap.
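Here is a rough sketch of the string case, using made-up names (StringArrayDemo and its methods). It shows that a new string[] starts out holding null references, and that storing the same string in two slots copies the reference rather than the string.

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
static class StringArrayDemo
{
    // A freshly allocated string[] holds null references, not string objects.
    public static bool NewSlotsAreNull()
    {
        string[] names = new string[2];
        return names[0] == null && names[1] == null;
    }

    // Two slots can refer to the very same string object on the heap.
    public static bool SlotsShareOneString()
    {
        string s = new string('x', 3);  // one string object, created at run time
        string[] names = new string[2];
        names[0] = s;
        names[1] = s;                   // copies the reference, not the characters
        return object.ReferenceEquals(names[0], names[1]);
    }

    static void Main()
    {
        Console.WriteLine(NewSlotsAreNull());
        Console.WriteLine(SlotsShareOneString());
    }
}
```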

Guffa
Yes, references behave exactly like value types, but I noticed they are usually not called that, nor included among the value types. See for instance (there are many more like this) http://msdn.microsoft.com/en-us/library/s1ax56ch.aspx
Henk Holterman
@Henk: Yes, you are right that references are not listed among value type variables, but when it comes to how memory is allocated for them, they are in every respect value types, and it's very useful to realise that in order to understand how the memory allocation all fits together. :)
Guffa
+2  A: 

I think at the core of your question lies a misunderstanding about reference and value types. This is something probably every .NET and Java developer struggled with.

An array is just a list of values. If it's an array of a reference type (say a string[]) then the array is a list of references to various string objects on the heap, as a reference is the value of a reference type. Internally, these references are implemented as pointers to an address in memory. If you wish to visualize this, such an array would look like this in memory (on the heap):

[ 00000000, 00000000, 00000000, F8AB56AA ]

This is an array of string that contains 4 references to string objects on the heap (the numbers here are hexadecimal). Currently, only the last reference actually points to anything (memory is initialized to all zeros when allocated). This array would basically be the result of this code in C#:

string[] strings = new string[4];
strings[3] = "something"; // the string was allocated at 0xF8AB56AA by the CLR

The above array would be in a 32 bit program. In a 64 bit program, the references would be twice as big (F8AB56AA would be 00000000F8AB56AA).

If you have an array of value types (say an int[]) then the array is a list of integers, as the value of a value type is the value itself (hence the name). The visualization of such an array would be this:

[ 00000000, 45FF32BB, 00000000, 00000000 ]

This is an array of 4 integers, where only the second int is assigned a value (to 1174352571, which is the decimal representation of that hexadecimal number) and the rest of the integers would be 0 (like I said, memory is initialized to zero and 00000000 in hexadecimal is 0 in decimal). The code that produced this array would be:

 int[] integers = new int[4];
 integers[1] = 1174352571; // integers[1] = 0x45FF32BB would be valid too

This int[] array would also be stored on the heap.

As another example, the memory of a short[4] array would look like this:

[ 0000, 0000, 0000, 0000 ]

As the value of a short is a 2 byte number.
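If you want to check these element sizes yourself, a minimal sketch follows. The class and method names are invented for illustration. Buffer.ByteLength reports the number of bytes taken by the elements of a primitive array, not counting any object header the runtime adds.

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
static class ElementSizeDemo
{
    // 4 ints of 4 bytes each are stored inline in the array.
    public static int BytesInIntArray()
    {
        return Buffer.ByteLength(new int[4]);
    }

    // 4 shorts of 2 bytes each are stored inline in the array.
    public static int BytesInShortArray()
    {
        return Buffer.ByteLength(new short[4]);
    }

    // The hexadecimal literal and the decimal number are the same int value.
    public static int SecondElement()
    {
        int[] integers = new int[4];
        integers[1] = 0x45FF32BB;
        return integers[1];
    }

    static void Main()
    {
        Console.WriteLine(BytesInIntArray());
        Console.WriteLine(BytesInShortArray());
        Console.WriteLine(SecondElement());
    }
}
```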

Where a value type is stored is just an implementation detail, as Eric Lippert explains very well. It is not inherent to the differences between value and reference types (which are differences in behavior).

When you pass something to a method (be that a reference type or a value type) then a copy of the value of the type is actually passed to the method. In the case of a reference type, the value is a reference (think of this as a pointer to a piece of memory, although that also is an implementation detail) and in the case of a value type, the value is the thing itself.

// Calling this method creates a copy of the *reference* to the string
// and a copy of the int itself, so copies of the *values*
void SomeMethod(string s, int i){}
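As a rough illustration of those copies (the PassByValueDemo wrapper and the Run method are hypothetical, added only for this sketch), reassigning the parameters inside the method leaves the caller's variables untouched:

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
static class PassByValueDemo
{
    // Both parameters are copies: s is a copy of the reference, i is a copy of the int.
    static void SomeMethod(string s, int i)
    {
        s = "changed"; // repoints the local copy of the reference only
        i = 99;        // overwrites the local copy of the int only
    }

    public static string Run()
    {
        string text = "original";
        int number = 1;
        SomeMethod(text, number);
        return text + ":" + number; // the caller's variables are unchanged
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```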

Boxing only occurs if you convert a value type to a reference type. This code boxes:

object o = 5;
JulianR
I believe "an implementation detail" should be a font-size: 50px. ;)
Simon Svensson
+8  A: 

Your array is allocated on the heap, and the ints are not boxed.

Your confusion likely stems from the common claim that reference types are allocated on the heap and value types are allocated on the stack. This is not an entirely accurate representation.

All local variables and parameters are allocated on the stack. This includes both value types and reference types. The difference between the two is only what is stored in the variable. Unsurprisingly, for a value type, the value of the type is stored directly in the variable, and for a reference type, the value of the type is stored on the heap, and a reference to this value is what is stored in the variable.

The same holds true for fields. When memory is allocated for an instance of an aggregate type (a class or a struct), it must include storage for each of its instance fields. For reference-type fields, this storage holds just a reference to the value, which would itself be allocated on the heap later. For value-type fields, this storage holds the actual value.

So, given the following types:

class RefType{
    public int    I;
    public string S;
    public long   L;
}

struct ValType{
    public int    I;
    public string S;
    public long   L;
}

The values of each of these types would require 16 bytes of memory (assuming a 32-bit word size). The field I in each case takes 4 bytes to store its value, the field S takes 4 bytes to store its reference, and the field L takes 8 bytes to store its value. So the memory for the value of both RefType and ValType looks like this:

 0 ┌───────────────────┐
   │        I          │
 4 ├───────────────────┤
   │        S          │
 8 ├───────────────────┤
   │        L          │
   │                   │
16 └───────────────────┘

Now if you had three local variables in a function, of types RefType, ValType, and int[], like this:

RefType refType;
ValType valType;
int[]   intArray;

then your stack might look like this:

 0 ┌───────────────────┐
   │     refType       │
 4 ├───────────────────┤
   │     valType       │
   │                   │
   │                   │
   │                   │
20 ├───────────────────┤
   │     intArray      │
24 └───────────────────┘

If you assigned values to these local variables, like so:

refType = new RefType();
refType.I = 100;
refType.S = "refType.S";
refType.L = 0x0123456789ABCDEF;

valType = new ValType();
valType.I = 200;
valType.S = "valType.S";
valType.L = 0x0011223344556677;

intArray = new int[4];
intArray[0] = 300;
intArray[1] = 301;
intArray[2] = 302;
intArray[3] = 303;

Then your stack might look something like this:

 0 ┌───────────────────┐
   │    0x4A963B68     │ -- heap address of `refType`
 4 ├───────────────────┤
   │       200         │ -- value of `valType.I`
   │    0x4A984C10     │ -- heap address of `valType.S`
   │    0x44556677     │ -- low 32-bits of `valType.L`
   │    0x00112233     │ -- high 32-bits of `valType.L`
20 ├───────────────────┤
   │    0x4AA4C288     │ -- heap address of `intArray`
24 └───────────────────┘

Memory at address 0x4A963B68 (value of refType) would be something like:

 0 ┌───────────────────┐
   │       100         │ -- value of `refType.I`
 4 ├───────────────────┤
   │    0x4A984D88     │ -- heap address of `refType.S`
 8 ├───────────────────┤
   │    0x89ABCDEF     │ -- low 32-bits of `refType.L`
   │    0x01234567     │ -- high 32-bits of `refType.L`
16 └───────────────────┘

Memory at address 0x4AA4C288 (value of intArray) would be something like:

 0 ┌───────────────────┐
   │        4          │ -- length of array
 4 ├───────────────────┤
   │       300         │ -- `intArray[0]`
 8 ├───────────────────┤
   │       301         │ -- `intArray[1]`
12 ├───────────────────┤
   │       302         │ -- `intArray[2]`
16 ├───────────────────┤
   │       303         │ -- `intArray[3]`
20 └───────────────────┘

Now if you passed intArray to another function, the value pushed onto the stack would be 0x4AA4C288, the address of the array, not a copy of the array.
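The same distinction shows up when you assign one variable to another. The sketch below redeclares RefType and ValType as above; the CopyDemo class and its methods are hypothetical names added for illustration. Copying a RefType variable copies the 4-byte reference, while copying a ValType variable copies all of its fields.

```csharp
using System;

class RefType
{
    public int    I;
    public string S;
    public long   L;
}

struct ValType
{
    public int    I;
    public string S;
    public long   L;
}

// Hypothetical demo class; the names are illustrative only.
static class CopyDemo
{
    public static int RefCopySeesChange()
    {
        RefType a = new RefType();
        RefType b = a;   // copies the reference; a and b name the same heap object
        b.I = 100;
        return a.I;      // reads the field written through b
    }

    public static int ValCopyIsIndependent()
    {
        ValType a = new ValType();
        ValType b = a;   // copies every field into b's own storage
        b.I = 200;
        return a.I;      // a still holds its default value
    }

    static void Main()
    {
        Console.WriteLine(RefCopySeesChange());
        Console.WriteLine(ValCopyIsIndependent());
    }
}
```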

P Daddy