why couldn't the designers of Linq make things simple (more sql like)
They could have. But your definition of simple (as a sql programmer) is not the same as the OO programmer's definition of simple. Linq (in C#) is a query technology for OO programmers, first. An example of this is why select comes last. That's to fulfill scope rules in C# and intellisense support in the editor.
These programmers maybe don't get LEFT JOIN
(and get really confused if you say LEFT OUTER JOIN
- thinking there's some difference, like one inherits from the other).
What they do understand is GROUP JOIN
, which behaves in a similar manner.
List<int> myInts = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
List<int> myOtherInts = new List<int>() { 1, 3, 5, 7, 9, 11, 13 };
//
var query = from i in myInts
join j in myOtherInts on i equals j into g
select new {key = i, myGroup = g};
//
foreach (var grouping in query)
{
Console.WriteLine("--{0}", grouping.key);
foreach (var x in grouping.myGroup)
Console.WriteLine(x);
}
All that DefaultIfEmpty
stuff does is unpack the group - flattening the results into row/column form - away from the OO programmer's natural heirarchical form. DefaultIfEmpty
is not semantically necessary to get to the results.
Here's the same query in method form - which the compiler generates from the above and which I prefer:
var query = myInts.GroupJoin(
myOtherInts,
i => i,
j => j,
(i, g) => new { key = i, myGroup = g }
);
Could you state that in terms of his example?
This query gives you customers, with their orders as an attached collection. The orders collection might be empty. If you have 50 customers and 1000 orders, you will have 50 customers in the result.
from c in dc.Customers
join o in dc.Orders on c.custid equals o.custid into someOrders
select new CustomerWithOrders()
{theCustomer = c, theOrders = someOrders};
This query gives you a CustomerOrder row. If a customer has 5 orders, the customer will appear 5 times, each time matched to a different order. If the customer has 0 orders, the customer will appear once matched to a null order. If you have 50 customers and 1000 orders, you will have 50-1049 rows after the join and the meaning of an element in the result is hard to define.
from c in dc.Customers
join o in dc.Orders on c.custid equals o.custid into temp
from x in temp.DefaultIfEmpty()
select new CustomerOrderHybrid() {theCustomer = c, theOrder = x}
If they implemented left join
, it would require the result shape of the second example. Once I used the group join
, which is better, I wouldn't implement the left join
in one step either. Hierarchical shaping of query results is great.