ansaurus

Question

Why doesn't deferred execution cache iterative values?

Answer 1

+5 A:

It is capturing vowel; try:

foreach(char vowel in "aeiou") {
     char tmp = vowel;
     query = query.Where(c => c != tmp);
}

Jon has a list of related posts here.

As for why... foreach is defined (in ECMA 334v4 §15.8.4) as:

A foreach statement of the form `foreach (V v in x) embedded-statement` is then expanded to:

{
  E e = ((C)(x)).GetEnumerator();
  try {
    V v;
    while (e.MoveNext()) {
      v = (V)(T)e.Current;
      embedded-statement
    }
  }
  finally {
    … // Dispose e
  }
}

Note the V is outside the while - this means it is captured once only for the foreach. I gather that the C# team have, at times, doubted the wisdom of this change (it was inside the while in the C# 1.2 spec, which would have avoided the problem completely). Maybe it'll change back eventually, but for now it is being true to the specification.

Marc Gravell 2009-06-09 12:18:49

I know it works with the temp variable; what I meant to ask is why doesn't it work the way that seems to be the intended way 99% of the time?

GWLlosa 2009-06-09 12:24:11

I know this may seem really dumb but what does the little § mean? I have seen a few times and never been game to ask

Nathan W 2009-06-09 12:37:05

§="section"; http://en.wikipedia.org/wiki/Section_sign

Marc Gravell 2009-06-09 12:44:02

thanks. thought as much, but didn't know how to google search it.

Nathan W 2009-06-09 12:46:36

Is there any documentation or guesstimates as to why they changed it after the C# 1.2 spec?

GWLlosa 2009-06-09 14:34:02

Wild guess - if V is a large value type, each declaration would be a large allocation leading to a performance problem.

David B 2009-06-09 14:41:20

@David B: How would that be any different? In fact, **because** of the glitch we now need two copies of it... in reality, value-types shouldn't be that large anyway.

Marc Gravell 2009-06-09 15:06:04

Answer 2

+4 A:

As Marc says, it's capturing vowel as a variable, rather than the value of vowel in each case. This is almost always undesirable, but it is the specified behaviour (i.e. the compiler is following the language spec). There's a single vowel variable, rather than a "new variable instance" per iteration.

Note that there's no expression tree involved here - you're using IEnumerable<T> so it's just converting the lambda expression into a delegate.

See section 7.14.4 of the C# 3 spec for more information.

Jon Skeet 2009-06-09 12:21:42

What I mean to ask is if its almost always undesirable, why does it work that way? Wouldn't everyone prefer if it worked the way we obviously meant?

GWLlosa 2009-06-09 12:23:28

If it is almost always undesirable, why is it the language spec?

Svish 2009-06-09 12:24:57

Yes, I suspect they would... and I suspect the language designers would regard this decision as a mistake now. Whether they'd be willing to break any applications (if there *are* any) which rely on this behaviour is a different matter though.

Jon Skeet 2009-06-09 12:25:50

The hard part is coming up with a realistic example where the change would *break* code that isn't already horribly, horribly broken. I seem to recall coming up with *one* example, but this example was truly, truly awful, and frankly deserved to get broken.

Marc Gravell 2009-06-09 12:28:15

Absolutely - I would support such a change, but whether the team is happy to allow even that breakage of weird, almost certainly broken code is a different matter.

Jon Skeet 2009-06-09 12:36:21

Yes, we now realize that this scenario does produce undesirable behaviour and wish that we'd done it the other way. However, changing it now would be a breaking change, and yes, we try to not break even weird code. You'd be surprised at the amount of weird code that's in the wild.

Eric Lippert 2009-06-09 14:49:59

I don't suppose anyone has an example of what sort of code benefits from it? I haven't been able to figure one out.

GWLlosa 2009-06-09 18:05:58

It's not so much that there's a benefit as that this is simply the unfortunate interaction of two individually sensible rules: (1) closures close over variables, and (2) the loop variable in a foreach is logically a single variable for every iteration, not a fresh new variable for each run through the loop. Though (1) has clear benefits (allowing different anonymous functions to share state) it is often surprising to people. But (2) has no clearly observable benefits that I know of; it just seems conceptually cleaner to declare only one variable, as a "for" loop typically does.

Eric Lippert 2009-06-09 23:09:18

In fact, I'm not even sure I'd say it's conceptually cleaner to declare only one variable. The iteration variable is read-only within the body of the foreach, and yet it magically changes between iterations. To me, it feels simpler to imagine it as a read-only variable declared *inside* the body of the loop (just before the user's code). That also works better with how it reads: "for each variable of type Foo in ..." The fact that the variable declaration part comes after the "each" suggests (to me) that there's a new variable per iteration. I know there *isn't*, but it's always felt wrong.

Jon Skeet 2009-06-10 05:37:36

I see your point -- you're making a semantic argument. But the counter-argument is the syntactic argument that in "foreach(C c in cs){ ... }, the "C c" is lexically _outside_ the { }, and therefore should behave as though it was a declaration of a single variable outside the { }. "foreach" is a bit of a platypus -- a bit of a mishmash and therefore hard to reason about intuitively.

Eric Lippert 2009-06-10 15:14:40

Good point (surprise surprise) - I hadn't really thought of the braces. I suspect would be hard to come up with a clean syntax which was syntactically accurate, semantically useful (i.e. had the behaviour we want most of the time) and intuitive. But hey, that's why I don't get paid to design languages ;)

Jon Skeet 2009-06-10 15:55:21

Answer 3

+1 A:

I think what you really want to do here is something more like this:

IEnumerable<char> query = "Not what you might expect";
query = query.Except("aeiou");

foreach (char Output in query)
{
    System.Out.WriteLine(Output);
}

Joel Coehoorn 2009-06-09 13:10:28

In this case, that's true, in that it would work. But this was a simplified example with lots of extraneous business code removed. The real version had a lot more stuff, including several nested foreaches and unions of unions of sequences. The root problem was the same as the simplified example, so that's what I posted.

GWLlosa 2009-06-09 13:37:47

ansaurus

tags:

views:

answers:

Why doesn't deferred execution cache iterative values?

related questions