I was trying to determine the overhead of the header on a .NET array (in a 32-bit process) using this code:

long bytes1 = GC.GetTotalMemory(false);
object[] array = new object[10000];
for (int i = 0; i < 10000; i++)
    array[i] = new int[1];
long bytes2 = GC.GetTotalMemory(false);
array[0] = null; // ensure no garbage collection before this point

Console.WriteLine(bytes2 - bytes1);
// Calculate array overhead in bytes by subtracting the size of 
// the array elements (40000 for object[10000] and 4 for each 
// array), and dividing by the number of arrays (10001)
Console.WriteLine("Array overhead: {0:0.000}", 
                  ((double)(bytes2 - bytes1) - 40000) / 10001 - 4);
Console.Write("Press any key to continue...");
Console.ReadKey();

The result was

    204800
    Array overhead: 12.478

In a 32-bit process, object[1] should be the same size as int[1], but in fact the overhead jumps by 3.28 bytes to

    237568
    Array overhead: 15.755

Anyone know why?

(By the way, if anyone's curious, the overhead for non-array objects, e.g. (object)i in the loop above, is about 8 bytes (8.384). I heard it's 16 bytes in 64-bit processes.)

+1  A: 

Because heap management (which is what GetTotalMemory deals with) can only allocate rather large blocks, which the CLR then subdivides into smaller chunks for the program's purposes.

Dewfy
Isn't memory allocated in 4 KB or 8 KB pages? In that case the maximum error is 2-4% in this example. I tried increasing the test size to 100,000 arrays, but this made little difference: int[1] overhead changed to 11.924 and object[1] overhead was 15.938: still a 3 byte difference.
Qwertie
@Qwertie: Actually the CLR allocates in rather big chunks from the OS. On 32 bit segments are typically around 16 MB. These segments are then used as storage for the managed heap.
Brian Rasmussen
Oops, that's now a 4 byte difference! An easier number to explain...
Qwertie
It seems to me GetTotalMemory works on a granularity of about 8 KB.
Qwertie
+2  A: 

I think you are making some faulty assumptions while measuring, as the memory allocated (via GetTotalMemory) during your loop may differ from the memory actually required for just the arrays - the memory may be allocated in larger blocks, there may be other objects in memory that are reclaimed during the loop, etc.

Here's some info for you on array overhead:

Philip Rieck
Faulty assumptions such as...?
Qwertie
Faulty assumptions such as: GetTotalMemory is only increased by the array allocations in your loop, and only by exactly the amount required.
Philip Rieck
+12  A: 

Array is a reference type. All reference types carry two additional word-sized fields: the type reference and a SyncBlock index field, which among other things is used to implement locks in the CLR. So the overhead on reference types is 8 bytes on 32 bit. On top of that, the array itself also stores its length, which is another 4 bytes. This brings the total overhead to 12 bytes.

And as I just learned from Jon Skeet's answer, arrays of reference types have an additional 4 bytes of overhead. This can be confirmed using WinDbg. It turns out that the additional word is another type reference for the type stored in the array. All arrays of reference types are stored internally as object[], with an additional reference to the type object of the actual element type. So a string[] is really just an object[] with an additional type reference to the type string. For details please see below.

Values stored in arrays: Arrays of reference types hold references to objects, so each entry in the array is the size of a reference (i.e. 4 bytes on 32 bit). Arrays of value types store the values inline and thus each element will take up the size of the type in question.

This question may also be of interest: http://stackoverflow.com/questions/1508215/c-listdouble-size-vs-double-size/1508232#1508232

Gory Details

Consider the following code

var strings = new string[1];
var ints = new int[1];

strings[0] = "hello world";
ints[0] = 42;

Attaching WinDbg shows the following:

First let's take a look at the value type array.

0:000> !dumparray -details 017e2acc 
Name: System.Int32[]
MethodTable: 63b9aa40
EEClass: 6395b4d4
Size: 16(0x10) bytes
Array: Rank 1, Number of elements 1, Type Int32
Element Methodtable: 63b9aaf0
[0] 017e2ad4
    Name: System.Int32
    MethodTable 63b9aaf0
    EEClass: 6395b548
    Size: 12(0xc) bytes
     (C:\Windows\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
    Fields:
          MT    Field   Offset                 Type VT     Attr    Value Name
    63b9aaf0  40003f0        0         System.Int32  1 instance       42 m_value <=== Our value

0:000> !objsize 017e2acc 
sizeof(017e2acc) =           16 (        0x10) bytes (System.Int32[])

0:000> dd 017e2acc -0x4
017e2ac8  00000000 63b9aa40 00000001 0000002a <=== That's the value

First we dump the array and the one element with value of 42. As can be seen the size is 16 bytes. That is 4 bytes for the int32 value itself, 8 bytes for regular reference type overhead and another 4 bytes for the length of the array.

The raw dump shows the SyncBlock, the method table for int[], the length, and the value of 42 (2a in hex). Notice that the SyncBlock is located just in front of the object reference.

Next, let's look at the string[] to find out what the additional word is used for.

0:000> !dumparray -details 017e2ab8 
Name: System.String[]
MethodTable: 63b74ed0
EEClass: 6395a8a0
Size: 20(0x14) bytes
Array: Rank 1, Number of elements 1, Type CLASS
Element Methodtable: 63b988a4
[0] 017e2a90
    Name: System.String
    MethodTable: 63b988a4
    EEClass: 6395a498
    Size: 40(0x28) bytes <=== Size of the string
     (C:\Windows\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
    String:     hello world    
    Fields:
          MT    Field   Offset                 Type VT     Attr    Value Name
    63b9aaf0  4000096        4         System.Int32  1 instance       12 m_arrayLength
    63b9aaf0  4000097        8         System.Int32  1 instance       11 m_stringLength
    63b99584  4000098        c          System.Char  1 instance       68 m_firstChar
    63b988a4  4000099       10        System.String  0   shared   static Empty
    >> Domain:Value  00226438:017e1198 <<
    63b994d4  400009a       14        System.Char[]  0   shared   static WhitespaceChars
    >> Domain:Value  00226438:017e1760 <<

0:000> !objsize 017e2ab8 
sizeof(017e2ab8) =           60 (        0x3c) bytes (System.Object[]) <=== Notice the underlying type of the string[]

0:000> dd 017e2ab8 -0x4
017e2ab4  00000000 63b74ed0 00000001 63b988a4 <=== Method table for string
017e2ac4  017e2a90 <=== Address of the string in memory

0:000> !dumpmt 63b988a4
EEClass: 6395a498
Module: 63931000
Name: System.String
mdToken: 02000024  (C:\Windows\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
BaseSize: 0x10
ComponentSize: 0x2
Number of IFaces in IFaceMap: 7
Slots in VTable: 196

First we dump the array and the string. Next we dump the size of the string[]. Notice that WinDbg lists the type as System.Object[] here. The object size in this case includes the string itself, so the total size is the 20 from the array plus the 40 for the string.

By dumping the raw bytes of the instance we can see the following: First we have the SyncBlock, then follows the method table for object[], then the length of the array. After that we find the additional 4 bytes with the reference to the method table for string. This can be verified by the dumpmt command as shown above. Finally we find the single reference to the actual string instance.

In conclusion

The overhead for arrays can be broken down as follows (on 32 bit that is)

  • 4 bytes SyncBlock
  • 4 bytes for Method table (type reference) for the array itself
  • 4 bytes for Length of array
  • Arrays of reference types add another 4 bytes to hold the method table of the actual element type (reference-type arrays are object[] under the hood)

I.e. the overhead is 12 bytes for value type arrays and 16 bytes for reference type arrays.
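As a rough cross-check, the breakdown above can be turned into a small size estimator. This is only a sketch of the arithmetic in this answer (`EstimateArraySize` is an illustrative name, and real sizes are additionally subject to the CLR's minimum object size and alignment):

```csharp
using System;

class ArraySizeEstimate
{
    // Per the breakdown above (32-bit CLR):
    // 4 (SyncBlock) + 4 (method table) + 4 (length) = 12 bytes,
    // plus 4 more for the element-type pointer in reference-type arrays.
    static int EstimateArraySize(int elementCount, int elementSize, bool referenceType)
    {
        int overhead = referenceType ? 16 : 12;
        return overhead + elementCount * elementSize;
    }

    static void Main()
    {
        Console.WriteLine(EstimateArraySize(1, 4, false)); // int[1]: 16, matching !objsize above
        Console.WriteLine(EstimateArraySize(1, 4, true));  // string[1]: 20, matching the dump above
    }
}
```

The two results match the 16 and 20 bytes reported by WinDbg in the dumps above.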

Brian Rasmussen
object[1] and int[1] are both reference types.
Qwertie
I think you mean "_two_ additional fields"
Henk Holterman
@Henk: Typo. Thanks!
Brian Rasmussen
@Qwertie: Yes, arrays are reference types. I was just pointing out the somewhat surprising fact that object takes up an additional 4 bytes. I have removed the comment as it doesn't have anything to do with arrays.
Brian Rasmussen
I read at http://codeproject.com/KB/dotnet/arrays.aspx that the additional 4 bytes are for an "element type field". I assume it is used to save a single indirection, since the CLR designers could have included that field within the virtual function table of each array type instead. In other words, I think they could have stored the field once per reference array type, but instead they chose to store a copy of the field in every array.
Qwertie
@Qwertie: That is already accounted for (the "type reference" in my answer). All reference types have this field.
Brian Rasmussen
@Brian: Looks like we were both expanding our answers at the same time. Could you check that mine looks correct to you? I've also given some guesswork as to why this is the case :)
Jon Skeet
@Jon, I included a comment about the SyncBlock field on your answer. As far as I can tell our stories are pretty much in sync.
Brian Rasmussen
A: 

Sorry for the off-topic post, but I found some interesting info on memory overhead just this morning.

We have a project which operates on a huge amount of data (up to 2 GB). As the major storage we use Dictionary<T,T>; thousands of dictionaries are actually created. After changing it to a List<T> for keys and a List<T> for values (we implemented IDictionary<T,T> ourselves), the memory usage decreased by about 30-40%.

Why?

Vasiliy Borovyak
If you have a look at the Dictionary class in Reflector, it uses an internal array of Entry<TKey, TValue> structs. Each struct holds, in addition to the key and value, 'hashCode' and 'next' ints used by the Dictionary implementation. If your own IDictionary class has no additional data per entry, those two ints are probably where your additional memory usage comes from.
thecoop
A dictionary internally uses a structure called Entry, which uses extra storage to hold "hashCode" and "next" fields for each entry. I figure these are probably used to resolve hash collisions.
Qwertie
There is also an array of integer "buckets" that is the same length as the array of Entries, adding 4 more bytes of overhead. Thus, I would expect a List of KeyValuePair<A,B> to be 12 bytes smaller per entry (typically) compared to a Dictionary<A,B>, because of the three integers: hashCode, next, and the "bucket" value.
Qwertie
Well, after further improvement of our `LightDictionary` we made it work as fast as the standard Dictionary, but using 3 times less memory. So from now on our data takes 700 MB of RAM instead of 2 GB. :) I just wanted to thank you guys. I'm happy.
Vasiliy Borovyak
LightDictionary? How do you save so much space while still being able to look up something quickly? Or did you really just need a list?
Qwertie
The thing is, we use dictionaries to store uint values **only**, so the hash is 100% overhead in our case, and the structure we use to store the data is two simple arrays. We created our own binary search which works much faster than Array.BinarySearch (I don't know why that one is so slow). Insertion into LightDictionary is as fast as into the standard one: a new array (1 uint larger) is allocated each time a value is inserted. The number of keys in our data is never greater than 10000, but the number of created dictionaries is tremendous.
Vasiliy Borovyak
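The per-entry arithmetic from the comments above can be sketched as a 32-bit back-of-the-envelope calculation (`PerEntrySavings` is an illustrative name; the field layout follows the Reflector description given in the comments):

```csharp
using System;

class DictionaryOverheadEstimate
{
    // Per-entry cost difference between Dictionary<K,V> and a plain
    // list of key/value pairs, per the fields described above.
    static int PerEntrySavings(int keySize, int valueSize)
    {
        int perDictionaryEntry = 4 /* hashCode */ + 4 /* next */
                               + keySize + valueSize
                               + 4 /* parallel "buckets" int */;
        int perListEntry = keySize + valueSize;
        return perDictionaryEntry - perListEntry;
    }

    static void Main()
    {
        // e.g. Dictionary<int,int> vs List<KeyValuePair<int,int>> on 32 bit
        Console.WriteLine(PerEntrySavings(4, 4)); // 12 bytes per entry
    }
}
```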
+13  A: 

Here's a slightly neater (IMO) short but complete program to demonstrate the same thing:

using System;

class Test
{
    const int Size = 100000;

    static void Main()
    {
        object[] array = new object[Size];
        long initialMemory = GC.GetTotalMemory(true);
        for (int i = 0; i < Size; i++)
        {
            array[i] = new string[0];
        }
        long finalMemory = GC.GetTotalMemory(true);
        GC.KeepAlive(array);

        long total = finalMemory - initialMemory;

        Console.WriteLine("Size of each element: {0:0.000} bytes",
                          ((double)total) / Size);
    }
}

But I get the same results - the overhead for any reference type array is 16 bytes, whereas the overhead for any value type array is 12 bytes. I'm still trying to work out why that is, with the help of the CLI spec. Don't forget that reference type arrays are covariant, which may be relevant...

EDIT: With the help of cordbg, I can confirm Brian's answer - the type pointer of a reference-type array is the same regardless of the actual element type. Presumably there's some funkiness in object.GetType() (which is non-virtual, remember) to account for this.

So, with code of:

object[] x = new object[1];
string[] y = new string[1];
int[] z = new int[1];
z[0] = 0x12345678;
lock(z) {}

We end up with something like the following:

Variables:
x=(0x1f228c8) <System.Object[]>
y=(0x1f228dc) <System.String[]>
z=(0x1f228f0) <System.Int32[]>

Memory:
0x1f228c4: 00000000 003284dc 00000001 00326d54 00000000 // Data for x
0x1f228d8: 00000000 003284dc 00000001 00329134 00000000 // Data for y
0x1f228ec: 00000000 00d443fc 00000001 12345678 // Data for z

Note that I've dumped the memory 1 word before the value of the variable itself.

For x and y, the values are:

  • The sync block, used for locking the hash code (or a thin lock - see Brian's comment)
  • Type pointer
  • Size of array
  • Element type pointer
  • Null reference (first element)

For z, the values are:

  • Sync block
  • Type pointer
  • Size of array
  • 0x12345678 (first element)

Different value type arrays (byte[], int[] etc) end up with different type pointers, whereas all reference type arrays use the same type pointer, but have a different element type pointer. The element type pointer is the same value as you'd find as the type pointer for an object of that type. So if we looked at a string object's memory in the above run, it would have a type pointer of 0x00329134.
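Even though all reference-type arrays share the same type pointer internally, GetType() still reports the precise array type, which is easy to check from managed code (a small sketch of the observable behavior):

```csharp
using System;

class ArrayTypeCheck
{
    static void Main()
    {
        object[] a = new string[1];

        Console.WriteLine(a.GetType() == typeof(string[])); // True: precise type is preserved
        Console.WriteLine(a.GetType() == typeof(object[])); // False, despite the shared type pointer
        Console.WriteLine(a is object[]);                   // True: array covariance
    }
}
```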

The word before the type pointer certainly has something to do with either the monitor or the hash code: calling GetHashCode() populates that bit of memory, and I believe the default object.GetHashCode() obtains a sync block to ensure hash code uniqueness for the lifetime of the object. However, just doing lock(x){} didn't do anything, which surprised me...

All of this is only valid for "vector" types, by the way - in the CLR, a "vector" type is a single-dimensional array with a lower-bound of 0. Other arrays will have a different layout - for one thing, they'd need the lower bound stored...

So far this has been experimentation, but here's the guesswork - the reason for the system being implemented the way it is. From here on, I really am just guessing.

  • All object[] arrays can share the same JIT code. They're going to behave the same way in terms of memory allocation, array access, Length property and (importantly) the layout of references for the GC. Compare that with value type arrays, where different value types may have different GC "footprints" (e.g. one might have a byte and then a reference, others will have no references at all, etc).
  • Every time you assign a value within an object[] the runtime needs to check that it's valid. It needs to check that the type of the object whose reference you're using for the new element value is compatible with the element type of the array. For instance:

    object[] x = new object[1];
    object[] y = new string[1];
    x[0] = new object(); // Valid
    y[0] = new object(); // Invalid - will throw an exception
    

This is the covariance I mentioned earlier. Now given that this is going to happen for every single assignment, it makes sense to reduce the number of indirections. In particular, I suspect you don't really want to blow the cache by having to go to the type object for each assignment to get the element type. I suspect (and my x86 assembly isn't good enough to verify this) that the test is something like:

  • Is the value to be copied a null reference? If so, that's fine. (Done.)
  • Fetch the type pointer of the object the reference points at.
  • Is that type pointer the same as the element type pointer (simple binary equality check)? If so, that's fine. (Done.)
  • Is that type pointer assignment-compatible with the element type pointer? (Much more complicated check, with inheritance and interfaces involved.) If so, that's fine - otherwise, throw an exception.

If we can terminate the search in the first three steps, there's not a lot of indirection - which is good for something that's going to happen as often as array assignments. None of this needs to happen for value type assignments, because that's statically verifiable.
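The check sequence above can be sketched in managed code. This is purely illustrative (`StoreChecked` is a made-up name; the real check lives in the runtime, not in C#):

```csharp
using System;

class CovariantStoreSketch
{
    // Models the runtime's covariant store check for "array[i] = value".
    static void StoreChecked(object[] array, int index, object value)
    {
        if (value != null) // 1. a null reference is always fine
        {
            Type elementType = array.GetType().GetElementType();
            Type valueType = value.GetType();

            // 2-3. fast path: simple equality against the element type
            if (valueType != elementType
                // 4. slow path: full assignment-compatibility check
                && !elementType.IsAssignableFrom(valueType))
            {
                throw new ArrayTypeMismatchException();
            }
        }
        array[index] = value; // the runtime's own check fires here too
    }

    static void Main()
    {
        object[] strings = new string[1];
        StoreChecked(strings, 0, "ok"); // passes: exact type match
        StoreChecked(strings, 0, null); // passes: null reference
        try
        {
            StoreChecked(strings, 0, new object()); // incompatible store
        }
        catch (ArrayTypeMismatchException)
        {
            Console.WriteLine("ArrayTypeMismatchException");
        }
    }
}
```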

So, that's why I believe reference type arrays are slightly bigger than value type arrays.

Great question - really interesting to delve into it :)

Jon Skeet
Philip's first link (http://www.codeproject.com/KB/dotnet/arrays.aspx) has an answer: "Reference types also have an element type field that precedes the data; it does seem redundant since the array's method table could instead extract the necessary type information. The presence of the element type as a field does allow the element's type information to be extracted quickly without a virtual call indirection and function call, which is important for features like array covariance."
Qwertie
Jon, I believe I found the missing four bytes. Please see my answer. If you want I can add the necessary dumps from WinDbg to illustrate.
Brian Rasmussen
And btw +1 for highlighting this fact.
Brian Rasmussen
The bytes you see in front of the instance are the SyncBlock entry. However, it is a bit more complicated than that, as this field is used for both locking and the hash code as you mention. CLR 2.0 introduced an optimization called thin locks. As I understand it, the thin lock can be stored directly in the SyncBlock field provided that it isn't used for the hash code and that the lock is not contended. If either of these occurs, the lock is "promoted" to a real SyncBlock lock. The upcoming Advanced .NET Debugging book (http://my.safaribooksonline.com/0321589637) has additional details.
Brian Rasmussen
Ah... and the thin lock is wiped out after the lock is exited. That makes sense - and I've now verified that indeed, if you dump the memory while *in* a lock statement, that word gets a value.
Jon Skeet