Correct me if I'm wrong, but doing a foreach over an IEnumerable<T> creates garbage no matter what T is. But I'm wondering about the case where you have a List<T> where T is Entity, and the list contains instances of a derived class such as Entity2D. Will it have to create a new enumerator for each derived class, and therefore create garbage?

Also, does having an interface, let's say IEntity, as T create garbage?

+21  A: 

List<T>'s GetEnumerator method actually is quite efficient.

When you loop through the elements of a List<T>, it calls GetEnumerator. This, in turn, returns an internal struct which holds a reference to the original list, an index, and a version ID used to detect changes to the list during enumeration.

However, since a struct is being used, it's really not creating "garbage" that the GC will ever deal with.
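A minimal sketch of how one might check this claim. The class and helper names (`EnumeratorKindDemo`, `StaticTypeOf`) are made up for illustration; only `List<T>.GetEnumerator` and `List<T>.Enumerator` come from the framework.

```csharp
using System;
using System.Collections.Generic;

static class EnumeratorKindDemo
{
    // Hypothetical helper: reports the compile-time type of an expression
    // by letting generic type inference pick T.
    static Type StaticTypeOf<T>(T value) => typeof(T);

    public static bool ListEnumeratorIsStruct()
    {
        var list = new List<int> { 1, 2, 3 };

        // Called directly on List<int>, GetEnumerator is declared to return
        // List<int>.Enumerator rather than IEnumerator<int>.
        Type enumeratorType = StaticTypeOf(list.GetEnumerator());

        return enumeratorType == typeof(List<int>.Enumerator)
            && enumeratorType.IsValueType;
    }

    static void Main()
    {
        Console.WriteLine(ListEnumeratorIsStruct()); // True
    }
}
```

Because foreach binds to that public GetEnumerator method when the variable is statically typed as List<T>, the enumerator lives in a local variable and no heap object is needed for it.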


As for "create a new enumerator for each derived class": .NET generics work differently from C++ templates. In .NET, the List<T> class (and its nested Enumerator struct) is defined one time and is usable for any T. When it is used with a particular T, the runtime constructs a closed generic type for that T, but this is only the type information for that newly constructed type, and quite small in general (for reference-type arguments such as Entity and Entity2D, the runtime shares the same compiled code). This differs from C++ templates, for example, where each type used is instantiated at compile time and "built in" to the executable.

In .NET, the executable specifies the definition for List<T>, not List<int>, List<Entity2D>, etc...
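As a rough illustration of the point about derived classes: the enumerator type depends only on the list's declared T, not on the runtime types of the elements. `Entity` and `Entity2D` are the names from the question; their empty bodies and the `GenericInstantiationDemo` class are assumptions made for this sketch.

```csharp
using System;
using System.Collections.Generic;

class Entity { }            // placeholder for the asker's base class
class Entity2D : Entity { } // placeholder for the asker's derived class

static class GenericInstantiationDemo
{
    // Boxes the enumerator only so its runtime type can be inspected.
    public static Type EnumeratorTypeFor<T>(List<T> list)
    {
        object boxed = list.GetEnumerator();
        return boxed.GetType();
    }

    static void Main()
    {
        var mixed = new List<Entity> { new Entity(), new Entity2D() };

        // The list is a List<Entity>, so its enumerator is List<Entity>.Enumerator
        // even though one element is an Entity2D.
        Console.WriteLine(EnumeratorTypeFor(mixed) == typeof(List<Entity>.Enumerator)); // True

        // Both closed types come from the single open definition List<>.
        Console.WriteLine(typeof(List<Entity>).GetGenericTypeDefinition()
                          == typeof(List<Entity2D>).GetGenericTypeDefinition()); // True

        // They are still distinct closed types at runtime.
        Console.WriteLine(typeof(List<Entity>.Enumerator)
                          == typeof(List<Entity2D>.Enumerator)); // False
    }
}
```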

Reed Copsey
+1. They also differ from Java's generics, which are purely a compile-time feature (all occurrences of type `T` are just regarded as `object`)
Adam Robinson
Nice point. .NET generics are very nice in this regard.
Reed Copsey
Thanks Reed that clears things up
Chris Watts
+1  A: 

Regardless of whether it's a List<Entity>, List<Entity2D>, or List<IEntity>, GetEnumerator will be called once per foreach. Further, it is irrelevant whether e.g. List<Entity> contains instances of Entity2D. An IEnumerable<T>'s implementation of GetEnumerator may create reference objects which will be collected. As Reed noted, List<T> in MS .NET avoids this by using only value types.

Matthew Flaschen
Unless you create your own enumerator type (meaning you create a reference type that implements `IEnumerator<T>`) or you cast the object to `IEnumerable<T>`, the enumerator *is* a value type. No additional references are created.
Adam Robinson
@Adam, yes, the question mentions `IEnumerable<T>` in general.
Matthew Flaschen
@Matthew: Yes, but his specific example mentioned a `List<T>`.
Adam Robinson
+4  A: 

I think you may be interested in this article which explains why List(T) will not create "garbage", as opposed to Collection(T):

Now, here comes the tricky part. Rumor has it that many of the types in System.Collections.Generic will not allocate an enumerator when using foreach. List's GetEnumerator, for example, returns a struct, which will just sit on the stack. Look for yourself with .NET Reflector, if you don't believe me. To prove to myself that a foreach over a List doesn't cause any heap allocations, I changed entities to be a List, did the exact same foreach loop, and ran the profiler. No enumerator!

[...]

However, there is definitely a caveat to the above. Foreach loops over Lists can still generate garbage. [Casting List to IEnumerable] Even though we're still doing a foreach over a List, when the list is cast to an interface, the value type enumerator must be boxed, and placed on the heap.
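One way to try reproducing the article's experiment without a profiler is sketched below. It assumes a runtime that provides `GC.GetAllocatedBytesForCurrentThread`; the class and method names are invented for the example. The numbers printed depend on the runtime and JIT (newer runtimes may be able to optimize the boxing away), so no particular output is claimed here.

```csharp
using System;
using System.Collections.Generic;

static class AllocationDemo
{
    public static int SumDirect(List<int> list)
    {
        int sum = 0;
        foreach (int x in list) sum += x;     // binds to the struct List<int>.Enumerator
        return sum;
    }

    public static int SumViaInterface(IEnumerable<int> sequence)
    {
        int sum = 0;
        foreach (int x in sequence) sum += x; // binds to IEnumerator<int>
        return sum;
    }

    // Returns the bytes allocated on this thread by one call of the action.
    public static long MeasureBytes(Func<int> action)
    {
        action(); // warm-up so one-time runtime work is not counted
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    static void Main()
    {
        var list = new List<int> { 1, 2, 3, 4 };

        // The delegates are created before measuring starts.
        Func<int> direct = () => SumDirect(list);
        Func<int> viaInterface = () => SumViaInterface(list);

        Console.WriteLine("direct:        " + MeasureBytes(direct));
        Console.WriteLine("via interface: " + MeasureBytes(viaInterface));
    }
}
```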

Michael Stum
Thanks I saw that article but I didn't think it was specific about having interfaces as T
Chris Watts
Good point: it avoids generating garbage only when you use the list directly. When you go through the interface, it creates garbage.
Gamlor
+2  A: 

An interesting note: as Reed Copsey pointed out, the List<T>.Enumerator type is actually a struct. This is both good and horrible.

It's good in the sense that calling foreach on a List<T> actually doesn't create garbage, as no new reference type objects are allocated for the garbage collector to worry about.

It's horrible in the sense that suddenly the return value of GetEnumerator is a value type, against almost every .NET developer's intuition (it is generally expected that GetEnumerator will return a nondescript IEnumerator<T>, as this is what is guaranteed by the IEnumerable<T> contract; List<T> gets around this by explicitly implementing IEnumerable<T>.GetEnumerator and publicly exposing a more specific implementation of IEnumerator<T> which happens to be a value type).

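So any code that, for example, passes a List<T>.Enumerator to a separate method which in turn calls MoveNext on that enumerator object risks an infinite loop: the struct is copied on every call (here, boxed into a fresh IEnumerator<T>), so the caller's own enumerator never advances. Like this: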

int CountListMembers<T>(List<T> list)
{
    using (var e = list.GetEnumerator())
    {
        int count = 0;
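        // Intentional bug: 'e' is a List<T>.Enumerator struct, so each call
        // boxes a fresh copy and 'e' itself never moves past the start.
        while (IncrementEnumerator(e, ref count)) { }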

        return count;
    }
}

bool IncrementEnumerator<T>(IEnumerator<T> enumerator, ref int count)
{
    if (enumerator.MoveNext())
    {
        ++count;
        return true;
    }

    return false;
}

The above code is very stupid; it's only meant as a trivial example of one scenario in which the return value of List<T>.GetEnumerator can cause highly unintuitive (and potentially disastrous) behavior.
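To make the copy visible, here is a small sketch (class and method names are hypothetical) showing that the original struct is untouched after being passed to a method taking IEnumerator<T>, followed by one possible fix: typing the variable as IEnumerator<T> so that the enumerator is boxed once and the same object is passed on every call.

```csharp
using System;
using System.Collections.Generic;

static class StructEnumeratorPitfall
{
    // Advances whatever enumerator object it is handed.
    static bool Advance<T>(IEnumerator<T> enumerator) => enumerator.MoveNext();

    public static bool OriginalIsUntouched()
    {
        var list = new List<int> { 42 };
        var e = list.GetEnumerator();       // typed as List<int>.Enumerator (struct)

        bool copyMoved = Advance(e);        // boxes a copy; only the copy advances
        bool originalMoved = e.MoveNext();  // this is still the first step for 'e'
        bool atEnd = !e.MoveNext();         // only now has 'e' passed the single item

        return copyMoved && originalMoved && atEnd;
    }

    public static int CountListMembers<T>(List<T> list)
    {
        // Typed as the interface: the struct is boxed once here, and every
        // call to Advance receives a reference to that same boxed object.
        using (IEnumerator<T> e = list.GetEnumerator())
        {
            int count = 0;
            while (Advance(e)) count++;
            return count;
        }
    }

    static void Main()
    {
        Console.WriteLine(OriginalIsUntouched());                                // True
        Console.WriteLine(CountListMembers(new List<string> { "a", "b", "c" })); // 3
    }
}
```

The fix trades the allocation-free struct for a single boxed object, which is the same cost as enumerating through IEnumerable<T>.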

But as I said, it's still kind of good in that it doesn't create garbage ;)

Dan Tao
+1, though it's also worth noting that the usage of `var` is what kills you, since it types the variable as `List<T>.Enumerator`. If you were to use `IEnumerator<T>`, you wouldn't have this issue, as you'd be passing around a single boxed reference to the original enumerator.
Adam Robinson
It should be noted that using a value type there creates very good optimization opportunities for the JIT. If you look at the native code produced, you'll see that it will inline pretty much everything, resulting in a simple loop over array indices; the only extra over an explicit loop you'd write yourself is the "version" check on every iteration to detect modifications to the collection.
Pavel Minaev
@Adam Robinson: Yeah; I guess that's kind of what I mean, though, about developers' expectations for `GetEnumerator`. Plenty of developers would not think twice about using `var` because it's always expected for `GetEnumerator` to return something typed as `IEnumerator<T>`. I'd also note that the issue here could be handled equally well (in fact I'd say better) by having the method accept an `IEnumerable<T>` instead of a `List<T>`.
Dan Tao
@Dan: Yeah, I don't mean to diminish your point (that the developer would likely use `var`), but just wanted to bring that fact to the forefront. And, you're right, the method should have accepted an `IEnumerable<T>` anyway.
Adam Robinson
A related posting: http://stackoverflow.com/questions/2939024/c-4-0-dynamic-and-foreach-statement/2939180#2939180
Eric Lippert