ansaurus

Question

Does List<T> create garbage in C# in foreach?

Answer 1

+21 A:

List<T>'s GetEnumerator method actually is quite efficient.

When you loop through the elements of a List<T>, it calls GetEnumerator. This, in turn, generates an internal struct which holds a reference to the original list, an index, and a version ID to track for changes in the list.

However, since a struct is being used, it's really not creating "garbage" that the GC will ever deal with.
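As a rough illustration of the point above, this is approximately what a foreach over a List<T> amounts to when written out by hand. The names ForeachExpansion, SumWithForeach, SumExpanded and EnumeratorIsStruct are made up for this sketch, and the hand-written expansion is a simplification of what the compiler actually emits:

```csharp
using System;
using System.Collections.Generic;

static class ForeachExpansion
{
    // Ordinary foreach directly over a List<int>.
    public static int SumWithForeach(List<int> list)
    {
        int sum = 0;
        foreach (int item in list) sum += item;
        return sum;
    }

    // Roughly the same loop written by hand (simplified sketch).
    public static int SumExpanded(List<int> list)
    {
        int sum = 0;
        // The local is typed as the struct List<int>.Enumerator, so no
        // heap object is created for the enumerator itself.
        List<int>.Enumerator e = list.GetEnumerator();
        try
        {
            while (e.MoveNext()) sum += e.Current;
        }
        finally
        {
            e.Dispose();
        }
        return sum;
    }

    // Reports whether the enumerator type is a value type.
    public static bool EnumeratorIsStruct() => typeof(List<int>.Enumerator).IsValueType;
}
```

Both methods should produce the same sum for the same list, and EnumeratorIsStruct() reports that the enumerator is a value type.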

As for "create a new enumerator for each derived class": .NET generics work differently than C++ templates. In .NET, the List<T> class (and its nested Enumerator struct) is defined once and is usable for any T. When a specific T is used, the runtime creates a constructed type for that T, but this is mostly just the type information for the newly created type, and is generally quite small. This differs from C++ templates, where each instantiation is generated at compile time and "built in" to the executable.

In .NET, the executable specifies the definition for List<T>, not List<int>, List<Entity2D>, etc...
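A small reflection sketch can make this visible, assuming nothing beyond the standard library. The class GenericTypeInfo and its methods are hypothetical names used only for illustration:

```csharp
using System;
using System.Collections.Generic;

static class GenericTypeInfo
{
    // List<int> and List<string> are distinct constructed types at run time,
    // yet both point back to the single open definition List<>.
    public static bool SharesDefinition()
    {
        Type ints = typeof(List<int>);
        Type strings = typeof(List<string>);
        return ints != strings
            && ints.GetGenericTypeDefinition() == typeof(List<>)
            && strings.GetGenericTypeDefinition() == typeof(List<>);
    }

    // The type argument is still known at run time.
    public static Type ElementTypeOf<T>(List<T> list)
    {
        return list.GetType().GetGenericArguments()[0];
    }
}
```

SharesDefinition() returns true, and ElementTypeOf(new List<string>()) returns typeof(string).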

Reed Copsey 2010-07-15 19:21:35

+1. They also differ from Java's generics, which are purely a compile-time feature (all occurrences of type `T` are just regarded as `object`)

Adam Robinson 2010-07-15 19:29:59

Nice point. .NET generics are very nice in this regard.

Reed Copsey 2010-07-15 19:31:23

Thanks Reed that clears things up

Chris Watts 2010-07-15 20:05:47

Answer 2

+1 A:

Regardless of whether it's a List<Entity>, List<Entity2D>, or List<IEntity>, GetEnumerator will be called once per foreach. Further, it is irrelevant whether e.g. List<Entity> contains instances of Entity2D. An IEnumerable<T>'s implementation of GetEnumerator may create reference objects which will be collected. As Reed noted, List<T> in MS .NET avoids this by using only value types.

Matthew Flaschen 2010-07-15 19:24:42

Unless you create your own enumerator type (meaning you create a reference type that implements `IEnumerator<T>`) or you cast the object to `IEnumerable<T>`, the enumerator *is* a value type. No additional references are created.

Adam Robinson 2010-07-15 19:31:18

@Adam, yes, the question mentions `IEnumerable<T>` in general.

Matthew Flaschen 2010-07-15 19:35:31

@Matthew: Yes, but his specific example mentioned a `List<T>`.

Adam Robinson 2010-07-15 19:45:30

Answer 3

+4 A:

I think you may be interested in this article which explains why List(T) will not create "garbage", as opposed to Collection(T):

Now, here comes the tricky part. Rumor has it that many of the types in System.Collections.Generic will not allocate an enumerator when using foreach. List's GetEnumerator, for example, returns a struct, which will just sit on the stack. Look for yourself with .NET Reflector, if you don't believe me. To prove to myself that a foreach over a List doesn't cause any heap allocations, I changed entities to be a List, did the exact same foreach loop, and ran the profiler. No enumerator!

[...]

However, there is definitely a caveat to the above. Foreach loops over Lists can still generate garbage. [Casting List to IEnumerable] Even though we're still doing a foreach over a List, when the list is cast to an interface, the value type enumerator must be boxed, and placed on the heap.
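One way to check the caveat from the quoted article yourself, sketched under some assumptions: it relies on GC.GetAllocatedBytesForCurrentThread, which exists only in more recent .NET versions, and the class ForeachAllocations and its method names are invented for this example. The exact byte counts depend on the runtime and JIT, so none are claimed here.

```csharp
using System;
using System.Collections.Generic;

static class ForeachAllocations
{
    // foreach over the concrete List<int>: the compiler uses the struct enumerator.
    public static int SumDirect(List<int> list)
    {
        int sum = 0;
        foreach (int item in list) sum += item;
        return sum;
    }

    // foreach through the interface: GetEnumerator returns IEnumerator<int>,
    // so the struct enumerator has to be boxed.
    public static int SumViaInterface(IEnumerable<int> items)
    {
        int sum = 0;
        foreach (int item in items) sum += item;
        return sum;
    }

    // Bytes allocated on the current thread while running the action once,
    // after a warm-up call so that one-time setup costs are not counted.
    public static long MeasureAllocatedBytes(Action action)
    {
        action();
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }
}
```

Calling MeasureAllocatedBytes(() => ForeachAllocations.SumDirect(list)) and then the same with SumViaInterface(list) lets you compare the two loops on your own runtime.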

Michael Stum 2010-07-15 19:26:26

Thanks I saw that article but I didn't think it was specific about having interfaces as T

Chris Watts 2010-07-15 20:02:32

Good point: it avoids generating garbage only when you use the list directly. When you go through the interface, it creates garbage.

Gamlor 2010-07-16 11:48:50

Answer 4

+2 A:

An interesting note: as Reed Copsey pointed out, the List<T>.Enumerator type is actually a struct. This is both good and horrible.

It's good in the sense that calling foreach on a List<T> actually doesn't create garbage, as no new reference type objects are allocated for the garbage collector to worry about.

It's horrible in the sense that suddenly the return value of GetEnumerator is a value type, against almost every .NET developer's intuition (it is generally expected that GetEnumerator will return a nondescript IEnumerator<T>, as this is what is guaranteed by the IEnumerable<T> contract; List<T> gets around this by explicitly implementing IEnumerable<T>.GetEnumerator and publicly exposing a more specific implementation of IEnumerator<T> which happens to be a value type).

So any code that, for example, passes a List<T>.Enumerator to a separate method which in turn calls MoveNext on that enumerator object faces the potential issue of an infinite loop: the struct is copied (here, boxed) on every call, so the caller's own enumerator never advances. Like this:

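int CountListMembers<T>(List<T> list)
{
    // var makes e a List<T>.Enumerator (a struct), not an IEnumerator<T>.
    using (var e = list.GetEnumerator())
    {
        int count = 0;
        // Each call boxes a fresh copy of e, so e itself never moves forward.
        // For a non-empty list this loop never terminates.
        while (IncrementEnumerator(e, ref count)) { }

        return count;
    }
}

bool IncrementEnumerator<T>(IEnumerator<T> enumerator, ref int count)
{
    // MoveNext advances only the boxed copy that was passed in.
    if (enumerator.MoveNext())
    {
        ++count;
        return true;
    }

    return false;
}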

The above code is very stupid; it's only meant as a trivial example of one scenario in which the return value of List<T>.GetEnumerator can cause highly unintuitive (and potentially disastrous) behavior.
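For contrast, here is a minimal sketch of the same idea that does terminate, plus a bounded demonstration of the copy behavior. The names EnumeratorCopyDemo, Advance, CountBoxedOnce and FirstAfterCopies are hypothetical and exist only for this illustration:

```csharp
using System;
using System.Collections.Generic;

static class EnumeratorCopyDemo
{
    // Receives the enumerator through the interface, like IncrementEnumerator above.
    static bool Advance<T>(IEnumerator<T> enumerator)
    {
        return enumerator.MoveNext();
    }

    // Declaring the variable as IEnumerator<T> boxes the struct once;
    // every call to Advance then shares that single boxed enumerator.
    public static int CountBoxedOnce<T>(List<T> list)
    {
        using (IEnumerator<T> e = list.GetEnumerator())
        {
            int count = 0;
            while (Advance(e)) count++;
            return count;
        }
    }

    // Shows the copy problem without looping forever: the two Advance calls
    // each move a boxed copy, so e still starts at the first element.
    // Assumes the list is non-empty.
    public static T FirstAfterCopies<T>(List<T> list)
    {
        var e = list.GetEnumerator(); // List<T>.Enumerator (struct)
        Advance(e);
        Advance(e);
        e.MoveNext();
        return e.Current;
    }
}
```

With a list containing 10, 20, 30, CountBoxedOnce returns 3 and FirstAfterCopies returns 10, since the struct variable was never advanced by the helper.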

But as I said, it's still kind of good in that it doesn't create garbage ;)

Dan Tao 2010-07-15 19:53:36

+1, though it's also worth noting that the usage of `var` is what kills you, since it types the variable as `List<T>.Enumerator`. If you were to use `IEnumerator<T>`, you wouldn't have this issue, as you'd be passing around a single boxed reference to the original enumerator.

Adam Robinson 2010-07-15 19:57:20

It should be noted that using a value type there creates very good optimization opportunities for the JIT. If you look at the native code produced, you'll see that it will inline pretty much everything, resulting in a simple loop over array indices; the only extra over an explicit loop you'd write yourself is the "version" check on every iteration to detect modifications to the collection.

Pavel Minaev 2010-07-15 20:01:11

@Adam Robinson: Yeah; I guess that's kind of what I mean, though, about developers' expectations for `GetEnumerator`. Plenty of developers would not think twice about using `var` because it's always expected for `GetEnumerator` to return something typed as `IEnumerator<T>`. I'd also note that the issue here could be handled equally well (in fact I'd say better) by having the method accept an `IEnumerable<T>` instead of a `List<T>`.

Dan Tao 2010-07-15 20:05:44

@Dan: Yeah, I don't mean to diminish your point (that the developer would likely use `var`), but just wanted to bring that fact to the forefront. And, you're right, the method should have accepted an `IEnumerable<T>` anyway.

Adam Robinson 2010-07-15 20:10:34

A related posting: http://stackoverflow.com/questions/2939024/c-4-0-dynamic-and-foreach-statement/2939180#2939180

Eric Lippert 2010-07-15 20:35:04
