When you use the new C# collection initialization syntax:

string[] sarray = new[] { "A", "B", "C", "D" };

does the compiler avoid initializing each array slot to the default value, or is it equivalent to:

string[] sarray = new string[4];  // all slots initialized to null
sarray[0] = "A";
sarray[1] = "B";
sarray[2] = "C";
sarray[3] = "D";
+12  A: 

The compiler still uses the newarr IL instruction, so the CLR will still initialize the array.

Collection initialization is just compiler magic - the CLR doesn't know anything about it, so it'll still assume it has to do sanity clearance.

However, this should be really, really quick - it's just wiping memory. I doubt it's a significant overhead in many situations.
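To make that concrete, here is a minimal sketch (the class and method names are made up for illustration). It only shows the observable behaviour: a freshly allocated array reads as all nulls, and the initializer form ends up with the same contents as allocate-then-store.

```csharp
using System;

public static class ArrayInitDemo
{
    // Returns true when every slot of a freshly allocated string[] is null,
    // i.e. the runtime cleared the memory before any element was stored.
    public static bool FreshArrayIsAllNull(int length)
    {
        string[] fresh = new string[length];
        foreach (string s in fresh)
        {
            if (s != null) return false;
        }
        return true;
    }

    // Builds the same array twice, once with initializer syntax and once with
    // explicit allocation followed by element stores, then compares contents.
    public static bool BothFormsMatch()
    {
        string[] viaInitializer = new[] { "A", "B", "C", "D" };

        string[] viaStores = new string[4];
        viaStores[0] = "A";
        viaStores[1] = "B";
        viaStores[2] = "C";
        viaStores[3] = "D";

        if (viaInitializer.Length != viaStores.Length) return false;
        for (int i = 0; i < viaStores.Length; i++)
        {
            if (viaInitializer[i] != viaStores[i]) return false;
        }
        return true;
    }
}
```

Calling ArrayInitDemo.FreshArrayIsAllNull(4) and ArrayInitDemo.BothFormsMatch() from a Main method should both return true.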

Jon Skeet
Interesting. I wonder if this 'memory wipe' approach to array initialization is one of the reasons why structs don't support explicit default constructors or member initializers. It would complicate array initialization.
LBushkin
Yes, that's a lot of it. In fact, structs in IL *do* support parameterless constructors, but they'll only be called in certain situations.
Jon Skeet
See http://msmvps.com/blogs/jon_skeet/archive/2008/12/10/value-types-and-parameterless-constructors.aspx for more info.
Jon Skeet
"I doubt it's a significant overhead in many situations" : that's especially true in that case, because you typically use the collection initialization syntax only for small arrays...
Thomas Levesque
+7  A: 

Quick test:

        string[] arr1 =
        {
            "A","B","C","D"
        };
        arr1.GetHashCode();

        string[] arr2 = new string[4];
        arr2[0] = "A";
        arr2[1] = "B";
        arr2[2] = "C";
        arr2[3] = "D";

        arr2.GetHashCode();

results in this IL (note that both forms compile to the same pattern: a newarr followed by four stelem.ref stores)

  IL_0002:  newarr     [mscorlib]System.String
  IL_0007:  stloc.2
  IL_0008:  ldloc.2
  IL_0009:  ldc.i4.0
  IL_000a:  ldstr      "A"
  IL_000f:  stelem.ref
  IL_0010:  ldloc.2
  IL_0011:  ldc.i4.1
  IL_0012:  ldstr      "B"
  IL_0017:  stelem.ref
  IL_0018:  ldloc.2
  IL_0019:  ldc.i4.2
  IL_001a:  ldstr      "C"
  IL_001f:  stelem.ref
  IL_0020:  ldloc.2
  IL_0021:  ldc.i4.3
  IL_0022:  ldstr      "D"
  IL_0027:  stelem.ref
  IL_0028:  ldloc.2
  IL_0029:  stloc.0
  IL_002a:  ldloc.0
  IL_002b:  callvirt   instance int32 [mscorlib]System.Object::GetHashCode()
  IL_0030:  pop
  IL_0031:  ldc.i4.4
  IL_0032:  newarr     [mscorlib]System.String
  IL_0037:  stloc.1
  IL_0038:  ldloc.1
  IL_0039:  ldc.i4.0
  IL_003a:  ldstr      "A"
  IL_003f:  stelem.ref
  IL_0040:  ldloc.1
  IL_0041:  ldc.i4.1
  IL_0042:  ldstr      "B"
  IL_0047:  stelem.ref
  IL_0048:  ldloc.1
  IL_0049:  ldc.i4.2
  IL_004a:  ldstr      "C"
  IL_004f:  stelem.ref
  IL_0050:  ldloc.1
  IL_0051:  ldc.i4.3
  IL_0052:  ldstr      "D"
  IL_0057:  stelem.ref
  IL_0058:  ldloc.1
  IL_0059:  callvirt   instance int32 [mscorlib]System.Object::GetHashCode()
BFree
+1 For the test.
Andrew Hare
+1  A: 

I ran a short test on instantiating an array using the syntax you describe and found that instantiating with non-default values took about 2.2 times longer than instantiating with default values.

When I switched and wrote the initializer with default values, it took about the same amount of time as a plain allocation.

Indeed, when I looked at the disassembly, it appears that the array is first allocated (and cleared), and then only the values that are not the default are stored.

Instantiating with non default values:

            bool[] abPrimes = new[] { 
                true, true
            };
0000007e  mov         edx,2 
00000083  mov         ecx,79114A46h 
00000088  call        FD3006F0 
0000008d  mov         dword ptr [ebp-64h],eax 
00000090  mov         eax,dword ptr [ebp-64h] 
00000093  mov         dword ptr [ebp-54h],eax 
00000096  mov         eax,dword ptr [ebp-54h] 
00000099  cmp         dword ptr [eax+4],0 
0000009d  ja          000000A4 
0000009f  call        76A9A8DC 
000000a4  mov         byte ptr [eax+8],1 
000000a8  mov         eax,dword ptr [ebp-54h] 
000000ab  cmp         dword ptr [eax+4],1 
000000af  ja          000000B6 
000000b1  call        76A9A8DC 
000000b6  mov         byte ptr [eax+9],1 
000000ba  mov         eax,dword ptr [ebp-54h] 
000000bd  mov         dword ptr [ebp-40h],eax

Instantiating with default values:

            bool[] abPrimes2 = new[] { 
                false, false
            };
000000c0  mov         edx,2 
000000c5  mov         ecx,79114A46h 
000000ca  call        FD3006F0 
000000cf  mov         dword ptr [ebp-68h],eax 
000000d2  mov         eax,dword ptr [ebp-68h] 
000000d5  mov         dword ptr [ebp-54h],eax 
000000d8  mov         eax,dword ptr [ebp-54h] 
000000db  mov         dword ptr [ebp-5Ch],eax
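
For anyone who wants to reproduce this kind of comparison, a rough sketch might look like the following. The class and method names are made up for illustration, this is not the code used for the numbers above, and the results will vary with the runtime, the JIT and the build settings.

```csharp
using System;
using System.Diagnostics;

public static class ArrayInitTiming
{
    // Allocates a two-element bool[] with non-default values in a loop
    // and returns the elapsed Stopwatch ticks.
    public static long TimeNonDefault(int iterations)
    {
        int sink = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            bool[] flags = new[] { true, true };
            sink += flags.Length; // use the array so the allocation is less likely to be optimized away
        }
        watch.Stop();
        GC.KeepAlive(sink);
        return watch.ElapsedTicks;
    }

    // Same loop, but every initializer value is the default for bool.
    public static long TimeDefault(int iterations)
    {
        int sink = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            bool[] flags = new[] { false, false };
            sink += flags.Length;
        }
        watch.Stop();
        GC.KeepAlive(sink);
        return watch.ElapsedTicks;
    }
}
```

Run each method once as a warm-up before taking measurements, and compare the tick counts over a large iteration count in a release build.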
patjbs
A: 

It is not possible to avoid initializing each array slot to the default value, at least at the IL level.

String is a CLASS, not a struct.

That means A, B, C, D and sarray itself can each be stored anywhere on the heap; the array only holds references. A, B, C and D may come from the intern pool, so the reference stored in each slot may only be known at run time.
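
A small sketch of the reference point, with made-up names: equal string literals in the same assembly refer to one shared instance, while a string built at run time is a separate object even when its value is equal.

```csharp
using System;

public static class LiteralReferenceDemo
{
    // Two arrays initialized from the same literals hold references
    // to the same string objects, not separate copies.
    public static bool LiteralsShareOneInstance()
    {
        string[] first = { "A", "B" };
        string[] second = { "A", "B" };
        return object.ReferenceEquals(first[0], second[0])
            && object.ReferenceEquals(first[1], second[1]);
    }

    // A string constructed at run time is a different object from the
    // literal, although the two compare equal by value.
    public static bool RuntimeStringIsSeparateObject()
    {
        string literal = "A";
        string built = new string('A', 1);
        return literal == built && !object.ReferenceEquals(literal, built);
    }
}
```

Both methods should return true when called from a Main method.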

But I believe the JIT could be smart enough to remove about half of this overhead.

PS. Premature optimization is the root of all evil.

Dennis Cheung