When you create an instance of a class with the new operator, memory gets allocated on the heap. When you create an instance of a struct with the new operator, where does the memory get allocated: on the heap or on the stack?

+1  A: 

As with all value types, structs always go where they were declared.

See this question here for more details on when to use structs. And this question here for some more info on structs.

Edit: I had mistakenly answered that they ALWAYS go on the stack. This is incorrect.

Esteban Araya
"structs always go where they were declared", this is a bit misleading and confusing. A struct field in a class is always placed into "dynamic memory when an instance of the type is constructed" - Jeff Richter. This may be indirectly on the heap, but is not the same as a normal reference type at all.
Ash
No, I think it's *exactly* right - even though it's not the same as a reference type. The value of a variable lives where it is declared. The value of a reference type variable is a reference, instead of the actual data, that's all.
Jon Skeet
In summary, whenever you create (declare) a value type anywhere in a method it is always created on the stack.
Ash
@Ash (recent): false; if the variable is "captured" (into a lambda/ anon-method) then it is actually a field on a compiler-generated class, i.e. on the heap
Marc Gravell
Jon, you miss my point. The reason this question was first asked is that it is not clear to many developers (me included until I read CLR Via C#) where a struct is allocated if you use the new operator to create it. Saying "structs always go where they were declared" is not a clear answer.
Ash
@Ash - also, "iterator blocks" - to the developer they look like methods, but again, all variables become fields, so heap.
Marc Gravell
@Marc, You are technically correct of course, why am I reminded of this anecdote? http://www.jokes2go.com/jokes/6342.html
Ash
Ash: Yes, you need to understand that the new operator isn't the same in C# as in C++. No, that doesn't mean it's accurate to claim that it's always stack-based.
Jon Skeet
Better than giving an incorrect, but *seemingly* useful answer. Actually, there are a lot of implications of the stack vs heap issue with captured variables and iterator blocks - i.e. it wouldn't work without them. So if you consider that useles...
Marc Gravell
Oh, and in terms of your "what's unclear to developers": I've seen many people get confused by the claim that value types are always allocated on the stack. I've had to correct this myth vast numbers of times on newsgroups.
Jon Skeet
Ash: Just to be clear, I do see why you're making the distinction between "where structs live" and "what new(...)" does - but I think the question needs *very* careful answering to avoid confusion. There are lots of nooks and crannies (such as calling the parameterless constructor...)
Jon Skeet
@Jon, Fair point, it's a good discussion that has helped me and I'm sure will help others too.
Ash
@Ash: If I have time, I'll try to write up an answer when I get to work. It's too big a topic to try to cover on the train though :)
Jon Skeet
A: 

Structs get allocated to the stack. Here is a helpful explanation:

Structs

" Additionally, classes when instantiated within .NET allocate memory on the heap or .NET's reserved memory space. Whereas structs yield more efficiency when instantiated due to allocation on the stack. Furthermore, it should be noted that passing parameters within structs are done so by value. "

DaveK
This doesn't cover the case when a struct is part of a class - at which point it lives on the heap, with the rest of the object's data.
Jon Skeet
Yes but it actually focuses on and answers the question being asked. Voted up.
Ash
...while still being incorrect and misleading. Sorry, but there are no short answers to this question - Jeffrey's is the only complete answer.
Marc Gravell
+7  A: 

The memory containing a struct's fields can be allocated on either the stack or the heap depending on the circumstances. If the struct-type variable is a local variable or parameter that is not captured by some anonymous delegate or iterator class, then it will be allocated on the stack. If the variable is part of some class, then it will be allocated within the class on the heap.

If the struct is allocated on the heap, then calling the new operator is not actually necessary to allocate the memory. The only purpose would be to set the field values according to whatever is in the constructor. If the constructor is not called, then all the fields will get their default values (0 or null).

Similarly for structs allocated on the stack, except that C# requires all local variables to be set to some value before they are used, so you have to call either a custom constructor or the default constructor (a constructor that takes no parameters is always available for structs).
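
As a rough illustration of the cases described above, here is a minimal sketch. The types Point and Holder and the class StructStorageDemo are hypothetical names made up for this example, and the comments about where the data lives describe the current implementation rather than anything the language guarantees.

```csharp
using System;

// Hypothetical value type used only for illustration.
struct Point
{
    public int X;
    public int Y;

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }
}

// Hypothetical class with a struct-typed field.
class Holder
{
    // Stored inline in each Holder object, which itself lives on the heap.
    public Point Position;
}

static class StructStorageDemo
{
    // Struct as a field of a class: no "new Point(...)" is needed,
    // the fields simply start out at their default values.
    public static int FieldDefaultSum()
    {
        var holder = new Holder();
        return holder.Position.X + holder.Position.Y;
    }

    // Struct as a plain local: C# requires definite assignment before use,
    // so a constructor is called here.
    public static int LocalSum()
    {
        Point local = new Point(3, 4);
        return local.X + local.Y;
    }

    // Struct local captured by a lambda: the compiler hoists it into a field
    // of a generated class, so it outlives this method call.
    public static Func<int> MakeCounter()
    {
        Point captured = new Point();
        return () => ++captured.X;
    }

    static void Main()
    {
        Console.WriteLine(FieldDefaultSum()); // 0
        Console.WriteLine(LocalSum());        // 7
        Func<int> counter = MakeCounter();
        counter();
        Console.WriteLine(counter());         // 2
    }
}
```

The captured case is the one that most clearly cannot live on the stack: the counter keeps its state after MakeCounter has returned.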

Jeffrey L Whitledge
A: 

Pretty much: structs, which are value types, are allocated on the stack, while objects are allocated on the heap and the object reference (pointer) is allocated on the stack.

bashmohandes
+85  A: 

Okay, let's see if I can make this any clearer.

Firstly, Ash is right: the question is not about where value type variables are allocated. That's a different question - and one to which the answer isn't just "on the stack". It's more complicated than that (and made even more complicated by C# 2). I have an article on the topic and will expand on it if requested, but let's deal with just the new operator.

Secondly, all of this really depends on what level you're talking about. I'm looking at what the compiler does with the source code, in terms of the IL it creates. It's more than possible that the JIT compiler will do clever things in terms of optimising away quite a lot of "logical" allocation.

Thirdly, I'm ignoring generics, mostly because I don't actually know the answer, and partly because it would complicate things too much.

Finally, all of this is just with the current implementation. The C# spec doesn't specify much of this - it's effectively an implementation detail. There are those who believe that managed code developers really shouldn't care. I'm not sure I'd go that far, but it's worth imagining a world where in fact all local variables live on the heap - which would still conform with the spec.


There are two different situations with the new operator on value types: you can either call a parameterless constructor (e.g. new Guid()) or a parameterful constructor (e.g. new Guid(someString)). These generate significantly different IL. To understand why, you need to compare the C# and CLI specs: according to C#, all value types have a parameterless constructor. According to the CLI spec, no value types have parameterless constructors. (Fetch the constructors of a value type with reflection some time - you won't find a parameterless one.)

It makes sense for C# to treat the "initialize a value with zeroes" as a constructor, because it keeps the language consistent - you can think of new(...) as always calling a constructor. It makes sense for the CLI to think of it differently, as there's no real code to call - and certainly no type-specific code.
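
The reflection check mentioned above can be sketched roughly as follows. The class CtorReflectionDemo and the helper HasParameterlessCtor are hypothetical names for this example; Type.GetConstructors and Type.GetConstructor are the standard reflection calls. The result assumes a struct such as Guid that does not declare its own parameterless constructor.

```csharp
using System;
using System.Reflection;

static class CtorReflectionDemo
{
    // Returns true if reflection reports a public instance constructor
    // that takes no parameters for the given type.
    public static bool HasParameterlessCtor(Type type)
    {
        return type.GetConstructor(Type.EmptyTypes) != null;
    }

    static void Main()
    {
        // List whatever public constructors the CLI metadata exposes for Guid.
        foreach (ConstructorInfo ctor in typeof(Guid).GetConstructors())
        {
            Console.WriteLine(ctor);
        }

        // Expected to print False: the metadata has no parameterless constructor.
        Console.WriteLine(HasParameterlessCtor(typeof(Guid)));

        // C# still accepts "new Guid()" and yields the all-zero value.
        Console.WriteLine(new Guid() == Guid.Empty);
    }
}
```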

It also makes a difference what you're going to do with the value after you've initialized it. The IL used for

Guid localVariable = new Guid(someString);

is different to the IL used for:

myInstanceOrStaticVariable = new Guid(someString);

In addition, if the value is used as an intermediate value, e.g. an argument to a method call, things are slightly different again. To show all these differences, here's a short test program. It doesn't show the difference between static variables and instance variables: the IL would differ between stfld and stsfld, but that's all.

using System;

public class Test
{
    static Guid field;

    static void Main() {}
    static void MethodTakingGuid(Guid guid) {}


    static void ParameterisedCtorAssignToField()
    {
        field = new Guid("");
    }

    static void ParameterisedCtorAssignToLocal()
    {
        Guid local = new Guid("");
        // Force the value to be used
        local.ToString();
    }

    static void ParameterisedCtorCallMethod()
    {
        MethodTakingGuid(new Guid(""));
    }

    static void ParameterlessCtorAssignToField()
    {
        field = new Guid();
    }

    static void ParameterlessCtorAssignToLocal()
    {
        Guid local = new Guid();
        // Force the value to be used
        local.ToString();
    }

    static void ParameterlessCtorCallMethod()
    {
        MethodTakingGuid(new Guid());
    }
}

Here's the IL for the class, excluding irrelevant bits (such as nops):

.class public auto ansi beforefieldinit Test extends [mscorlib]System.Object    
{
    // Removed Test's constructor, Main, and MethodTakingGuid.

    .method private hidebysig static void ParameterisedCtorAssignToField() cil managed
    {
        .maxstack 8
        L_0001: ldstr ""
        L_0006: newobj instance void [mscorlib]System.Guid::.ctor(string)
        L_000b: stsfld valuetype [mscorlib]System.Guid Test::field
        L_0010: ret     
    }

    .method private hidebysig static void ParameterisedCtorAssignToLocal() cil managed
    {
        .maxstack 2
        .locals init ([0] valuetype [mscorlib]System.Guid guid)    
        L_0001: ldloca.s guid    
        L_0003: ldstr ""    
        L_0008: call instance void [mscorlib]System.Guid::.ctor(string)    
        // Removed ToString() call
        L_001c: ret
    }

    .method private hidebysig static void ParameterisedCtorCallMethod() cil  managed    
    {   
        .maxstack 8
        L_0001: ldstr ""
        L_0006: newobj instance void [mscorlib]System.Guid::.ctor(string)
        L_000b: call void Test::MethodTakingGuid(valuetype [mscorlib]System.Guid)
        L_0011: ret     
    }

    .method private hidebysig static void ParameterlessCtorAssignToField() cil managed
    {
        .maxstack 8
        L_0001: ldsflda valuetype [mscorlib]System.Guid Test::field
        L_0006: initobj [mscorlib]System.Guid
        L_000c: ret 
    }

    .method private hidebysig static void ParameterlessCtorAssignToLocal() cil managed
    {
        .maxstack 1
        .locals init ([0] valuetype [mscorlib]System.Guid guid)
        L_0001: ldloca.s guid
        L_0003: initobj [mscorlib]System.Guid
        // Removed ToString() call
        L_0017: ret 
    }

    .method private hidebysig static void ParameterlessCtorCallMethod() cil managed
    {
        .maxstack 1
        .locals init ([0] valuetype [mscorlib]System.Guid guid)    
        L_0001: ldloca.s guid
        L_0003: initobj [mscorlib]System.Guid
        L_0009: ldloc.0 
        L_000a: call void Test::MethodTakingGuid(valuetype [mscorlib]System.Guid)
        L_0010: ret 
    }

    .field private static valuetype [mscorlib]System.Guid field
}

As you can see, there are lots of different instructions used for calling the constructor:

  • newobj: Allocates the value on the stack, calls a parameterised constructor. Used for intermediate values, e.g. for assignment to a field or use as a method argument.
  • call instance: Uses an already-allocated storage location (whether on the stack or not). This is used in the code above for assigning to a local variable. If the same local variable is assigned a value several times using several new calls, it just initializes the data over the top of the old value - it doesn't allocate more stack space each time.
  • initobj: Uses an already-allocated storage location and just wipes the data. This is used for all our parameterless constructor calls, including those which assign to a local variable. For the method call, an intermediate local variable is effectively introduced, and its value wiped by initobj.

I hope this shows how complicated the topic is, while shining a bit of light on it at the same time. In some conceptual senses, every call to new allocates space on the stack - but as we've seen, that isn't what really happens even at the IL level. I'd like to highlight one particular case. Take this method:

void HowManyStackAllocations()
{
    Guid guid = new Guid();
    // [...] Use guid
    guid = new Guid(someBytes);
    // [...] Use guid
    guid = new Guid(someString);
    // [...] Use guid
}

That "logically" has 4 stack allocations - one for the variable, and one for each of the three new calls - but in fact the stack is only allocated once, and then the same storage location is reused.

I've learned a lot in writing this answer - please ask for clarification if any of it is unclear!

Jon Skeet
Where do you get the time? :)
leppie
Nice answer Jon, the clarification of the complexity of this area is very important. I should know by now there is very rarely a black and white answer to most software development questions.
Ash
Jon, the HowManyStackAllocations example code is good. But could you either change it to use a Struct instead of Guid, or add a new Struct example. I think that would then directly address @kedar's original question.
Ash
Guid is already a struct. See http://msdn.microsoft.com/en-us/library/system.guid.aspx - I wouldn't have picked a reference type for this question :)
Jon Skeet
Great answer Jon. Really great.
Pablo Santa Cruz
What happens when you have `List<Guid>` and add those 3 to it? That would be 3 allocations (same IL)? But they're kept somewhere magical
Arec Barrwin
+2  A: 

To put it compactly, new is a misnomer for structs; calling new simply calls the constructor. The only storage location for the struct is the location where it is defined.

If it is a member variable, it is stored directly in whatever it is defined in; if it is a local variable or parameter, it is stored on the stack.

Contrast this to classes, which have a reference wherever the struct would have been stored in its entirety, while the reference points somewhere on the heap. (Member within, local/parameter on stack)

It may help to look a bit into C++, where there is no real distinction between class and struct. (Both keywords exist in the language, but they differ only in the default accessibility of members.) When you call new you get a pointer to a heap location, while if you have a non-pointer variable it is stored directly on the stack or within the containing object, a la structs in C#.

Guvante
A: 

I'm probably missing something here but why do we care about allocation?

Value types are passed by value ;) and thus can't be mutated at a different scope than where they are defined. To be able to mutate the value you have to add the [ref] keyword.

Reference types are passed by reference (strictly speaking, the reference itself is passed by value), so the object they refer to can be mutated.

There are of course immutable reference types, strings being the most popular one.
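
A small sketch of the passing behaviour described above, assuming a hypothetical struct Counter and class Box (neither name comes from this thread):

```csharp
using System;

// Hypothetical value type.
struct Counter
{
    public int Value;
}

// Hypothetical reference type.
class Box
{
    public int Value;
}

static class PassingDemo
{
    // Receives a copy of the struct; the caller's variable is untouched.
    public static void BumpCopy(Counter c) { c.Value++; }

    // Receives the caller's variable itself because of "ref".
    public static void BumpRef(ref Counter c) { c.Value++; }

    // Receives a copy of the reference; both refer to the same object.
    public static void BumpBox(Box b) { b.Value++; }

    public static int AfterBumpCopy()
    {
        var c = new Counter();
        BumpCopy(c);
        return c.Value;
    }

    public static int AfterBumpRef()
    {
        var c = new Counter();
        BumpRef(ref c);
        return c.Value;
    }

    public static int AfterBumpBox()
    {
        var b = new Box();
        BumpBox(b);
        return b.Value;
    }

    static void Main()
    {
        Console.WriteLine(AfterBumpCopy()); // 0
        Console.WriteLine(AfterBumpRef());  // 1
        Console.WriteLine(AfterBumpBox());  // 1
    }
}
```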

Array layout/initialization:
Value types -> zero memory: [name,zip][name,zip]
Reference types -> zero memory -> null: [ref][ref]
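
The array behaviour can be checked with a short sketch. Address and AddressRef are hypothetical types chosen to mirror the [name,zip] layout above; the comments on layout describe the current implementation.

```csharp
using System;

// Hypothetical value type: array elements hold the data inline.
struct Address
{
    public string Name;
    public int Zip;
}

// Hypothetical reference type: array elements hold references.
class AddressRef
{
    public string Name;
    public int Zip;
}

static class ArrayLayoutDemo
{
    // Each element is a zeroed Address, so its fields are readable immediately.
    public static int FirstZipOfStructArray()
    {
        var addresses = new Address[2];
        return addresses[0].Zip;
    }

    // Each element is a zeroed reference, i.e. null, until an object is assigned.
    public static bool FirstElementOfClassArrayIsNull()
    {
        var addresses = new AddressRef[2];
        return addresses[0] == null;
    }

    static void Main()
    {
        Console.WriteLine(FirstZipOfStructArray());          // 0
        Console.WriteLine(FirstElementOfClassArrayIsNull()); // True
    }
}
```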

A: 

Where are static structs and static classes stored in memory?