ansaurus

Question

Answer 1

A:

Probably because Linq expressions are simply syntactic sugar in the compiler, which translates them into method calls. So the query syntax is a leaky abstraction of an object-oriented system.

Since you're not actually writing SQL, there are bound to be cases where the underlying technology behaves differently. Adding a SQL-like 'left join' is probably a lot harder than you think.

Some people evidently don't know how Linq expressions work, so here's a further explanation.

If I take this test class:

public class Class1
{
    public List<string> list = new List<string>() { "test", "test1", "test2" };

    public void test_lambda()
    {
        var test = list.Where(l => l == "test1");
    }

    public void test_linq()
    {
        var test = from l in list
                   where l == "test2"
                   select l;
    }
}

list.Where(l => l == "test2") is compiled to the same code as from l in list where l == "test2" select l. In both cases the compiler generates anonymous method delegates:

.method public hidebysig instance void test_lambda() cil managed
{
    .maxstack 8
    L_0000: ldarg.0 
    L_0001: ldfld class [mscorlib]System.Collections.Generic.List`1<string> Class1::list
    L_0006: ldsfld class [System.Core]System.Func`2<string, bool> Class1::CS$<>9__CachedAnonymousMethodDelegate1
    L_000b: brtrue.s L_001e
    L_000d: ldnull 
    L_000e: ldftn bool Class1::<test_lambda>b__0(string)
    L_0014: newobj instance void [System.Core]System.Func`2<string, bool>::.ctor(object, native int)
    L_0019: stsfld class [System.Core]System.Func`2<string, bool> Class1::CS$<>9__CachedAnonymousMethodDelegate1
    L_001e: ldsfld class [System.Core]System.Func`2<string, bool> Class1::CS$<>9__CachedAnonymousMethodDelegate1
    L_0023: call class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0> [System.Core]System.Linq.Enumerable::Where<string>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>, class [System.Core]System.Func`2<!!0, bool>)
    L_0028: pop 
    L_0029: ret 
}

.method public hidebysig instance void test_linq() cil managed
{
    .maxstack 8
    L_0000: ldarg.0 
    L_0001: ldfld class [mscorlib]System.Collections.Generic.List`1<string> Class1::list
    L_0006: ldsfld class [System.Core]System.Func`2<string, bool> Class1::CS$<>9__CachedAnonymousMethodDelegate3
    L_000b: brtrue.s L_001e
    L_000d: ldnull 
    L_000e: ldftn bool Class1::<test_linq>b__2(string)
    L_0014: newobj instance void [System.Core]System.Func`2<string, bool>::.ctor(object, native int)
    L_0019: stsfld class [System.Core]System.Func`2<string, bool> Class1::CS$<>9__CachedAnonymousMethodDelegate3
    L_001e: ldsfld class [System.Core]System.Func`2<string, bool> Class1::CS$<>9__CachedAnonymousMethodDelegate3
    L_0023: call class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0> [System.Core]System.Linq.Enumerable::Where<string>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>, class [System.Core]System.Func`2<!!0, bool>)
    L_0028: pop 
    L_0029: ret 
}

This is what I mean by syntactic sugar. Query expressions don't add anything new to the language, they just provide an easier way of using existing language features.

Adam Lassek 2009-02-11 14:41:55

Yeah I guessed it was probably hard - I'd like to know why it was hard though...

2009-02-11 14:51:42

Linq is not syntactic sugar. the compiler doesn't translate anything INTO lambda expressions. No one said anything about generating SQL in the question. -1

David B 2009-02-11 17:10:13

OK, I'm not positive it's lambdas per se, but linq IS syntactic sugar, and gets translated into method calls by the compiler.

Adam Lassek 2009-02-11 17:36:21

Linq is more than query expressions. I'll withdraw my -1.

David B 2009-02-11 20:21:10

I think there was confusion due to my original use of 'linq' without qualification. I've clarified that, since linq refers to many, many different things.

Adam Lassek 2009-02-11 20:27:39

The question is about Linq to SQL, that is a query of type IQueryable<T> instead of IEnumerable<T> as in your example. In Linq to SQL, the compiler will not translate to method calls, but instead generates an expression tree that a library can translate to anything it chooses (like SQL strings).

Tom Lokhorst 2009-02-11 21:58:57

To be more precise, the choose for generating either plain method calls or an expression tree, is based on the type of the expression. That is, whether the type is `Func<T, R>` or `Expression<Func<T, R>>`.

Tom Lokhorst 2009-02-11 22:06:47

It just so happens to be that `IEnumerable<T>` implements methods with a `Func<T, R>` type and `IQueryable<T>` methods with type `Expression<Func<T, R>>`.

Tom Lokhorst 2009-02-11 22:07:57

Were I querying a DataContext in my example, both expressions would have been compiled into Expression Trees instead. My point was that Query Expressions are syntactic sugar that is converted into more traditional objects behind the scenes. I am aware that it is more complicated than I let on.

Adam Lassek 2009-02-12 06:51:45

Answer 2

+5 A:

why couldn't the designers of Linq make things simple (more sql like)

They could have. But your definition of simple (as a sql programmer) is not the same as the OO programmer's definition of simple. Linq (in C#) is a query technology for OO programmers, first. An example of this is why select comes last. That's to fulfill scope rules in C# and intellisense support in the editor.

These programmers maybe don't get LEFT JOIN (and get really confused if you say LEFT OUTER JOIN - thinking there's some difference, like one inherits from the other).

What they do understand is GROUP JOIN, which behaves in a similar manner.

List<int> myInts = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
List<int> myOtherInts = new List<int>() { 1, 3, 5, 7, 9, 11, 13 };
//
var query = from i in myInts
    join j in myOtherInts on i equals j into g
    select new {key = i, myGroup = g};
//
foreach (var grouping in query)
{
  Console.WriteLine("--{0}", grouping.key);
  foreach (var x in grouping.myGroup)
    Console.WriteLine(x);
}

All that DefaultIfEmpty stuff does is unpack the group - flattening the results into row/column form - away from the OO programmer's natural heirarchical form. DefaultIfEmpty is not semantically necessary to get to the results.

Here's the same query in method form - which the compiler generates from the above and which I prefer:

var query = myInts.GroupJoin(
    myOtherInts,
    i => i,
    j => j,
    (i, g) => new { key = i, myGroup = g }
);

Could you state that in terms of his example?

This query gives you customers, with their orders as an attached collection. The orders collection might be empty. If you have 50 customers and 1000 orders, you will have 50 customers in the result.

from c in dc.Customers
join o in dc.Orders on c.custid equals o.custid into someOrders
select new CustomerWithOrders()
  {theCustomer = c, theOrders = someOrders};

This query gives you a CustomerOrder row. If a customer has 5 orders, the customer will appear 5 times, each time matched to a different order. If the customer has 0 orders, the customer will appear once matched to a null order. If you have 50 customers and 1000 orders, you will have 50-1049 rows after the join and the meaning of an element in the result is hard to define.

from c in dc.Customers
join o in dc.Orders on c.custid equals o.custid into temp
from x in temp.DefaultIfEmpty()
select new CustomerOrderHybrid() {theCustomer = c, theOrder = x}

If they implemented left join, it would require the result shape of the second example. Once I used the group join, which is better, I wouldn't implement the left join in one step either. Hierarchical shaping of query results is great.

David B 2009-02-11 17:06:54

I don't see how your answer addresses the question. Could you state that in terms of his example?

le dorfier 2009-02-11 19:05:16

I'd first life to say thanks for taking the time to wrote your your thought provoking post. I'll have a closer look tomorrow - my immediate response however is to say that there is a clear Sqlish slant to Linq already so to say that OO programmers wouldn't get it doesn't really wash as a reason.

2009-02-11 21:48:09

Ouch my brain hurts... but I think I understand about the LINQ designers trying to preserve heirachical representations in preference to flattened ones.

stusmith 2009-02-12 10:20:18

ansaurus

tags:

views:

answers:

Why does Linq do joins differently

related questions