After reading "Odd query expressions" by Jon Skeet, I tried the code below. I expected the LINQ query at the end to translate to int query = proxy.Where(x => x).Select(x => x); which should not compile, because Where returns an int and int has no Select method. Instead, the code compiles, prints "Where(x => x)" to the screen, and sets query to 2. Select is never called, but it needs to be there for the code to compile. What is happening?

using System;
using System.Linq.Expressions;

public class LinqProxy
{
    public Func<Expression<Func<string,string>>,int> Select { get; set; }
    public Func<Expression<Func<string,string>>,int> Where { get; set; }
}

class Test
{
    static void Main()
    {
        LinqProxy proxy = new LinqProxy();

        proxy.Select = exp => 
        { 
            Console.WriteLine("Select({0})", exp);
            return 1;
        };
        proxy.Where = exp => 
        { 
            Console.WriteLine("Where({0})", exp);
            return 2;
        };

        int query = from x in proxy
                    where x
                    select x;
    }
}
+8  A: 

It's because your "select x" is effectively a no-op, so the compiler doesn't bother putting the Select(x => x) call at the end. It would if you removed the where clause, though. Once the where clause has been translated, what remains (from x in proxy.Where(x => x) select x) is known as a degenerate query expression. See section 7.16.2.3 of the C# 4 spec for more details. In particular:

A degenerate query expression is one that trivially selects the elements of the source. A later phase of the translation removes degenerate queries introduced by other translation steps by replacing them with their source. It is important however to ensure that the result of a query expression is never the source object itself, as that would reveal the type and identity of the source to the client of the query. Therefore this step protects degenerate queries written directly in source code by explicitly calling Select on the source. It is then up to the implementers of Select and other query operators to ensure that these methods never return the source object itself.

So, here are three translations (regardless of data source):

// Query                          // Translation
from x in proxy                   proxy.Where(x => x)
where x
select x


from x in proxy                   proxy.Select(x => x)
select x               


from x in proxy                   proxy.Where(x => x)
where x                                .Select(x => x * 2)
select x * 2
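
The three translations above can be observed at run time with a source that records which query methods get called. This is only a sketch: LoggingSource and DegenerateDemo are hypothetical names that do not come from the question, and the range variable is an int here so that x * 2 type-checks.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical source type (not from the question). It implements no
// interface; it only records which query methods the compiler calls.
public class LoggingSource
{
    private readonly List<string> log;

    public LoggingSource(List<string> log) { this.log = log; }

    public LoggingSource Where(Func<int, bool> predicate)
    {
        log.Add("Where");
        return new LoggingSource(log);   // new object, same shared log
    }

    public LoggingSource Select(Func<int, int> projection)
    {
        log.Add("Select");
        return new LoggingSource(log);
    }
}

public static class DegenerateDemo
{
    // where + trivial select: the select is removed after translation.
    public static string WhereThenTrivialSelect()
    {
        var log = new List<string>();
        LoggingSource result = from x in new LoggingSource(log)
                               where x > 0
                               select x;
        return string.Join(", ", log);
    }

    // Degenerate query written directly in source: Select is kept.
    public static string TrivialSelectOnly()
    {
        var log = new List<string>();
        LoggingSource result = from x in new LoggingSource(log)
                               select x;
        return string.Join(", ", log);
    }

    // Non-trivial projection: both calls are emitted.
    public static string WhereThenProjection()
    {
        var log = new List<string>();
        LoggingSource result = from x in new LoggingSource(log)
                               where x > 0
                               select x * 2;
        return string.Join(", ", log);
    }

    public static void Main()
    {
        Console.WriteLine(WhereThenTrivialSelect());   // prints "Where"
        Console.WriteLine(TrivialSelectOnly());        // prints "Select"
        Console.WriteLine(WhereThenProjection());      // prints "Where, Select"
    }
}
```

No using System.Linq directive is needed, because the translated calls bind to the instance methods on LoggingSource.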
Jon Skeet
Cool, didn't know this.
gaearon
Thanks. I figured it was something like that, but I wasn't sure exactly why it was being ignored.
mcrumley
+6  A: 

It compiles because LINQ query syntax is translated by purely syntactic substitution. The compiler turns

int query = from x in proxy
            where x
            select x;

into

int query = proxy.Where(x => x);     // note it optimises the select away

and only then does it check whether the methods that appear in the translated code actually exist on the type of proxy. Since the translation contains no Select call, Select does not actually need to exist for the specific example you gave to compile.
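
A minimal sketch of that claim, assuming a compiler that performs the translation before binding as the spec describes (I believe current Roslyn-based compilers do; I have not checked older ones). WhereOnly and WhereOnlyDemo are hypothetical names, and WhereOnly deliberately defines no Select method.

```csharp
using System;

// Hypothetical type that defines Where but no Select at all.
public class WhereOnly
{
    public int Where(Func<string, string> selector)
    {
        return 2;
    }
}

public static class WhereOnlyDemo
{
    public static int Run()
    {
        // Translated to new WhereOnly().Where(x => x). The trailing
        // "select x" is dropped, so no Select method is required.
        int query = from x in new WhereOnly()
                    where x
                    select x;
        return query;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```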

If you had something like this:

int query = from x in proxy
            select x.ToString();

then it would get changed into:

int query = proxy.Select(x => x.ToString());

and the Select method would be called.

Timwi
Well, the point is that it doesn't check Select at all. It was the degenerate query part which was confusing the OP, I suspect.
Jon Skeet
@Jon: Expanded.
Timwi