I'm reading the book "LINQ Pocket Reference" and there is a particular example (slightly modified below) that I'm having difficulty getting my head around... The explanation in the book is a bit brief, so I was wondering if someone could break it down step-by-step for me so that it makes sense...

    IEnumerable<char> query2 = "Not what you might expect";
    foreach (char vowel in "aeiou")
    {
        var t = vowel;
        query2 = query2.Where(c => c != t);
        // iterate through query and output (snipped for brevity)
    }

Outputs this:

    Not wht you might expect
    Not wht you might xpct
    Not wht you mght xpct
    Nt wht yu mght xpct
    Nt wht y mght xpct

Which makes perfect sense to me... However, this does not.

    IEnumerable<char> query2 = "Not what you might expect";
    foreach (char vowel in "aeiou")
    {
        query2 = query2.Where(c => c != vowel);
        // iterate through query and output (snipped for brevity)
    }

Outputs this:

    Not wht you might expect
    Not what you might xpct
    Not what you mght expect
    Nt what yu might expect
    Not what yo might expect

which doesn't...

Can someone give me a better explanation of exactly what is going on here?

+5  A: 

What happens in the first example is that the value of vowel is copied into a new local variable, t, which is declared inside the body of the foreach loop. Each iteration therefore gets its own t.

The Where clause for the query then uses that variable. Where clauses like this take an anonymous method or lambda, which can capture local variables. Since each lambda captures its own t, and that t is never changed afterwards, each filter keeps testing against the vowel that was current when it was created.

In the second case, however, the lambda doesn't capture the current value, only which variable to use. Since that variable changes on every pass through the loop, you build a new Where clause on top of the last one, but you also effectively modify all the preceding ones, because they all read the same variable.

So in the first example, you get this type of query:

    IEnumerable<char> query2 = "Not what you might expect";
    char t1 = 'a'; query2 = query2.Where(c => c != t1);
    char t2 = 'e'; query2 = query2.Where(c => c != t2);
    char t3 = 'i'; query2 = query2.Where(c => c != t3);
    char t4 = 'o'; query2 = query2.Where(c => c != t4);
    char t5 = 'u'; query2 = query2.Where(c => c != t5);

In the second example, you get this:

    IEnumerable<char> query2 = "Not what you might expect";
    char vowel = 'a'; query2 = query2.Where(c => c != vowel);
    vowel = 'e'; query2 = query2.Where(c => c != vowel);
    vowel = 'i'; query2 = query2.Where(c => c != vowel);
    vowel = 'o'; query2 = query2.Where(c => c != vowel);
    vowel = 'u'; query2 = query2.Where(c => c != vowel);

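If you enumerate this second query after the loop has finished, the value of vowel will be 'u', so only the u will be stripped out. The query still consists of five chained filters over the same string, all testing against 'u', but only the first one has anything left to remove. If you enumerate inside the loop instead, as in your example, every filter built so far tests against whatever vowel holds in that iteration, which is why each output line is missing just one vowel.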
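Here is a minimal sketch of the two expansions above. It is written so that the behaviour does not depend on the compiler version: from C# 5 onwards, foreach gives each iteration a fresh loop variable, so the question's second snippet behaves like the first one there. To mimic the single captured vowel, the sketch declares the shared variable by hand outside the loop. The names CaptureDemo, SharedVariable and CopyPerIteration are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CaptureDemo
{
    private const string Text = "Not what you might expect";

    // All five lambdas capture the SAME variable (mimics capturing 'vowel').
    public static List<string> SharedVariable()
    {
        var lines = new List<string>();
        IEnumerable<char> query = Text;
        char shared = '\0';                 // declared once, outside the loop
        foreach (char vowel in "aeiou")
        {
            shared = vowel;                 // every filter built so far now sees this value
            query = query.Where(c => c != shared);
            lines.Add(new string(query.ToArray()));   // enumerate inside the loop
        }
        return lines;
    }

    // Each lambda captures its OWN variable 't' (the temp-variable version).
    public static List<string> CopyPerIteration()
    {
        var lines = new List<string>();
        IEnumerable<char> query = Text;
        foreach (char vowel in "aeiou")
        {
            char t = vowel;                 // fresh variable on every iteration
            query = query.Where(c => c != t);
            lines.Add(new string(query.ToArray()));
        }
        return lines;
    }
}
```

SharedVariable() should give the five lines from the second output in the question (ending with "Not what yo might expect"), and CopyPerIteration() the five lines from the first (ending with "Nt wht y mght xpct").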

This capturing of variables is one of the things we all trip over when using anonymous methods/lambdas, and you can read more about it here: C# In Depth: The Beauty of Closures.

If you browse down that page to the text under Comparing capture strategies: complexity vs power, you'll find some examples of this behaviour.

Lasse V. Karlsen
Thanks... Clarified things nicely... I'd figured it out to some extent... (See my answer to my own question below) :)
Andrew Rollings
+1  A: 

Actually, on rereading it, it makes sense. Using the temp variable means that the temp itself is captured within the query... The loop body runs five times, so five separate temp variables are created, and each Where clause holds a reference to its own one.

In the case without the temp variable, there is only the reference to the loop variable.

So five references versus one reference. That's why it produces the results as shown.

In the first case, once the loop has run to completion, the query holds references to five different temp variables, hence stripping out a, e, i, o and u respectively.

In the second case, it's doing the same thing... only all five references are to the same variable, which can obviously only contain one value at a time.

Moral of the story: Think "reference" not "value".
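To make "reference, not value" concrete, here is a small sketch (the names DeferredDemo and Run and the "banana" input are just for illustration). The lambda reads the captured variable each time the query is enumerated, not when Where is called, so changing the variable afterwards changes what the same query returns.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns the results of enumerating the same query twice,
    // before and after the captured variable is changed.
    public static string[] Run()
    {
        char excluded = 'a';
        IEnumerable<char> query = "banana".Where(c => c != excluded);

        string before = new string(query.ToArray());   // filter compares against 'a'

        excluded = 'n';                                 // the query itself is not rebuilt
        string after = new string(query.ToArray());    // filter now compares against 'n'

        return new[] { before, after };
    }
}
```

Run() should return "bnn" followed by "baaa", even though the query was only built once.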

So, does this make sense to anyone else now?

Andrew Rollings