tags:

views:

297

answers:

2

OK this is a little moan but it's also a question. In Linq I can do a join like this:

from c in dc.Customers join o in dc.Orders on c.custid equals o.custid ...

All well and good and completely rememberable without having to go back and google it. However left joins are far more complicated for some reason:

from c in dc.Customers 
join o in dc.Orders on c.custid equals o.custid 
into temp from x in temp.DefaultIfEmpty() ...

So my question is why couldn't the designers of Linq make things simple (more sql like) with something like this:

from c in dc.Customers left join o in dc.Orders on c.custid equals o.custid ...

Cheers Lee

A: 

Probably because Linq expressions are simply syntactic sugar in the compiler, which translates them into method calls. So the query syntax is a leaky abstraction of an object-oriented system.

Since you're not actually writing SQL, there are bound to be cases where the underlying technology behaves differently. Adding a SQL-like 'left join' is probably a lot harder than you think.


Some people evidently don't know how Linq expressions work, so here's a further explanation.

If I take this test class:

public class Class1
{
    public List<string> list = new List<string>() { "test", "test1", "test2" };

    public void test_lambda()
    {
        var test = list.Where(l => l == "test1");
    }

    public void test_linq()
    {
        var test = from l in list
                   where l == "test2"
                   select l;
    }
}

list.Where(l => l == "test2") is compiled to the same code as from l in list where l == "test2" select l. In both cases the compiler generates anonymous method delegates:

.method public hidebysig instance void test_lambda() cil managed
{
    .maxstack 8
    L_0000: ldarg.0 
    L_0001: ldfld class [mscorlib]System.Collections.Generic.List`1<string> Class1::list
    L_0006: ldsfld class [System.Core]System.Func`2<string, bool> Class1::CS$<>9__CachedAnonymousMethodDelegate1
    L_000b: brtrue.s L_001e
    L_000d: ldnull 
    L_000e: ldftn bool Class1::<test_lambda>b__0(string)
    L_0014: newobj instance void [System.Core]System.Func`2<string, bool>::.ctor(object, native int)
    L_0019: stsfld class [System.Core]System.Func`2<string, bool> Class1::CS$<>9__CachedAnonymousMethodDelegate1
    L_001e: ldsfld class [System.Core]System.Func`2<string, bool> Class1::CS$<>9__CachedAnonymousMethodDelegate1
    L_0023: call class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0> [System.Core]System.Linq.Enumerable::Where<string>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>, class [System.Core]System.Func`2<!!0, bool>)
    L_0028: pop 
    L_0029: ret 
}

.method public hidebysig instance void test_linq() cil managed
{
    .maxstack 8
    L_0000: ldarg.0 
    L_0001: ldfld class [mscorlib]System.Collections.Generic.List`1<string> Class1::list
    L_0006: ldsfld class [System.Core]System.Func`2<string, bool> Class1::CS$<>9__CachedAnonymousMethodDelegate3
    L_000b: brtrue.s L_001e
    L_000d: ldnull 
    L_000e: ldftn bool Class1::<test_linq>b__2(string)
    L_0014: newobj instance void [System.Core]System.Func`2<string, bool>::.ctor(object, native int)
    L_0019: stsfld class [System.Core]System.Func`2<string, bool> Class1::CS$<>9__CachedAnonymousMethodDelegate3
    L_001e: ldsfld class [System.Core]System.Func`2<string, bool> Class1::CS$<>9__CachedAnonymousMethodDelegate3
    L_0023: call class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0> [System.Core]System.Linq.Enumerable::Where<string>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>, class [System.Core]System.Func`2<!!0, bool>)
    L_0028: pop 
    L_0029: ret 
}

This is what I mean by syntactic sugar. Query expressions don't add anything new to the language, they just provide an easier way of using existing language features.

Adam Lassek
Yeah I guessed it was probably hard - I'd like to know why it was hard though...
Linq is not syntactic sugar. the compiler doesn't translate anything INTO lambda expressions. No one said anything about generating SQL in the question. -1
David B
OK, I'm not positive it's lambdas per se, but linq IS syntactic sugar, and gets translated into method calls by the compiler.
Adam Lassek
Linq is more than query expressions. I'll withdraw my -1.
David B
I think there was confusion due to my original use of 'linq' without qualification. I've clarified that, since linq refers to many, many different things.
Adam Lassek
The question is about Linq to SQL, that is a query of type IQueryable<T> instead of IEnumerable<T> as in your example. In Linq to SQL, the compiler will not translate to method calls, but instead generates an expression tree that a library can translate to anything it chooses (like SQL strings).
Tom Lokhorst
To be more precise, the choose for generating either plain method calls or an expression tree, is based on the type of the expression. That is, whether the type is `Func<T, R>` or `Expression<Func<T, R>>`.
Tom Lokhorst
It just so happens to be that `IEnumerable<T>` implements methods with a `Func<T, R>` type and `IQueryable<T>` methods with type `Expression<Func<T, R>>`.
Tom Lokhorst
Were I querying a DataContext in my example, both expressions would have been compiled into Expression Trees instead. My point was that Query Expressions are syntactic sugar that is converted into more traditional objects behind the scenes. I am aware that it is more complicated than I let on.
Adam Lassek
+5  A: 

why couldn't the designers of Linq make things simple (more sql like)

They could have. But your definition of simple (as a sql programmer) is not the same as the OO programmer's definition of simple. Linq (in C#) is a query technology for OO programmers, first. An example of this is why select comes last. That's to fulfill scope rules in C# and intellisense support in the editor.

These programmers maybe don't get LEFT JOIN (and get really confused if you say LEFT OUTER JOIN - thinking there's some difference, like one inherits from the other).

What they do understand is GROUP JOIN, which behaves in a similar manner.

List<int> myInts = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
List<int> myOtherInts = new List<int>() { 1, 3, 5, 7, 9, 11, 13 };
//
var query = from i in myInts
    join j in myOtherInts on i equals j into g
    select new {key = i, myGroup = g};
//
foreach (var grouping in query)
{
  Console.WriteLine("--{0}", grouping.key);
  foreach (var x in grouping.myGroup)
    Console.WriteLine(x);
}

All that DefaultIfEmpty stuff does is unpack the group - flattening the results into row/column form - away from the OO programmer's natural heirarchical form. DefaultIfEmpty is not semantically necessary to get to the results.

Here's the same query in method form - which the compiler generates from the above and which I prefer:

var query = myInts.GroupJoin(
    myOtherInts,
    i => i,
    j => j,
    (i, g) => new { key = i, myGroup = g }
);


Could you state that in terms of his example?

This query gives you customers, with their orders as an attached collection. The orders collection might be empty. If you have 50 customers and 1000 orders, you will have 50 customers in the result.

from c in dc.Customers
join o in dc.Orders on c.custid equals o.custid into someOrders
select new CustomerWithOrders()
  {theCustomer = c, theOrders = someOrders};

This query gives you a CustomerOrder row. If a customer has 5 orders, the customer will appear 5 times, each time matched to a different order. If the customer has 0 orders, the customer will appear once matched to a null order. If you have 50 customers and 1000 orders, you will have 50-1049 rows after the join and the meaning of an element in the result is hard to define.

from c in dc.Customers
join o in dc.Orders on c.custid equals o.custid into temp
from x in temp.DefaultIfEmpty()
select new CustomerOrderHybrid() {theCustomer = c, theOrder = x}

If they implemented left join, it would require the result shape of the second example. Once I used the group join, which is better, I wouldn't implement the left join in one step either. Hierarchical shaping of query results is great.

David B
I don't see how your answer addresses the question. Could you state that in terms of his example?
le dorfier
I'd first life to say thanks for taking the time to wrote your your thought provoking post. I'll have a closer look tomorrow - my immediate response however is to say that there is a clear Sqlish slant to Linq already so to say that OO programmers wouldn't get it doesn't really wash as a reason.
Ouch my brain hurts... but I think I understand about the LINQ designers trying to preserve heirachical representations in preference to flattened ones.
stusmith