views:

193

answers:

5

It is true that generic collections perform better than non-generic collections for value types. (i.e. List vs. ArrayList).

But why is that, other than the boxing-unboxing step? Where are the value type objects stored once added to the collection? In non-generic collections, they'll be boxed and stored on heap, what is different in generics?

+3  A: 

An ArrayList is an local array of references to objects stored in the heap.

A generic List of references types is a local array of references to objects stored in the heap.

A generic List of value types is a local array of those value types.

There are two areas of memory, which most references call "The Stack" and "The Heap". Most people who use those terms have no idea why. ("The Stack" may be a stack, but The Heap almost certainly isn't a heap). I prefer the terms "Over Here" and "Over There". When boxed, the value type data is stored "Over There". When stored in an array (perhaps inside a generic List), the value type data is stored "Over Here". "Over here" is better.

James Curran
"Over here" - "over there"??? What???? And what is a "local array"? This answer is more confusing than anything.
0xA3
@0xA3: A local array is one which is stored in "The Stack". Is it more enlightening phrased that way?
James Curran
OK, then your statement is not correct (if you mean stack in the commonly used sense). Both `ArrayList` and `List<T>` use a System.Array as internal storage which is a reference type and is stored "over there".
0xA3
Huh. I thought the answer was pretty clear. I think this last comment demonstrates exactly why it is confusing to use terms "The Heap" and "The Stack". Indeed, the Array will be allocated on "The Heap", but the real question is where is the memory allocated to store each element of the array? In a `List<T>`, the memory to store the value types is within the memory allocated for the System.Array (i.e. "Over Here"). In an ArrayList each element is just a reference to a boxed value type, so the actual memory to store each value type is elsewhere on "The Heap", i.e. somewhere "Over There".
Dr. Wily's Apprentice
+5  A: 

In generics, such as List<T>, they're still stored on the heap. The difference is that, internally, a List<int> makes a single array of integers, and can store the numbers directly. WIth ArrayList, you end up storing an array of references to boxed integer values.

Reed Copsey
A: 

There are several reasons besides boxing and unboxing, including memory caching and the way they are enumerated to perform their jobs. Check out this post, especially the comments.

Nikki9696
A: 

The performance gains in generics are generally only with regards to value types used with generics compared with value types stored in non-generic equivalents.

This is because with generics value types do not need to be cast to object and stored on the heap (boxed). In fact they can remain on the stack which is more performant.

http://msdn.microsoft.com/en-us/library/ms172181.aspx

TheCodeKing
+5  A: 

The relevant implementation detail is that the underlying storage for a List<T> is a T[]. So for a List<int> the values will be stored in an int[]. The integers are stored in a contiguous chunk of memory, allocated from the garbage collected heap.

What makes it so fast is not just that the integers are not boxed, it is that an int[] works so very well with the CPU cache. When you read the first element, you essentially get the next 15 for free without having to read the slow RAM or secondary cache. This works not nearly so well for a boxed int because it is so large and the extra reference can have poor cache locality. However, the garbage collector really helps to take the sting out of that cost by compacting the heap.

Hans Passant
Ah, the only answer that explicitly says that `List<T>` in fact uses an array as a backing store (and well put too). A `List<T>` is simply a wrapper for an array that adds support for resizing, and in fact almost all collection types come down to arrays, except for special collections such as `LinkedList<T>`.
Allon Guralnek
So an `ArrayList` has a double performance hit. First, you have to dereference the value from the array in order to get the object (boxed value type), and then you have to unbox it.
Jim Mischel
@Hans: But if the backing store for `ArrayList` is `object[]`, then first you have to look up the item in the array, that gives you a reference to the object that contains the int. You have to dereference that to get the object, then you have to unbox it. Compare that to `List`, where all you have to do is grab the item from the `int[]` backing store.
Jim Mischel
@Jim, I think you're right, post amended to point out the extra cost of the dereference.
Hans Passant