views:

61

answers:

2

I have the following group by linq statement

from c in Categories
join p in Products on c equals p.Category into ps
select new { Category = new {c.CategoryID, c.CategoryName}, Products = ps };

However this generates the following left outer join query and returns all categories even if there are no products associated.

SELECT [t0].[CategoryID], [t0].[CategoryName], [t1].[ProductID], [t1].[ProductName],     [t1].[SupplierID], [t1].[CategoryID] AS [CategoryID2], [t1].[QuantityPerUnit],   [t1].[UnitPrice], [t1].[UnitsInStock], [t1].[UnitsOnOrder], [t1].[ReorderLevel], [t1].[Discontinued], (
    SELECT COUNT(*)
    FROM [Products] AS [t2]
    WHERE [t0].[CategoryID] = [t2].[CategoryID]
    ) AS [value]
FROM [Categories] AS [t0]
LEFT OUTER JOIN [Products] AS [t1] ON [t0].[CategoryID] = [t1].[CategoryID]
ORDER BY [t0].[CategoryID], [t1].[ProductID]

What I really want is to return only those categories that have associated products. But if I re-write the linq query like so:

from c in Categories
join p in Products on c equals p.Category
group p by new {c.CategoryID, c.CategoryName} into ps
select new { Category = ps.Key, Products = ps };

This gives me the desired result but a query is generated for each category:

SELECT [t0].[CategoryID], [t0].[CategoryName]
FROM [Categories] AS [t0]
INNER JOIN [Products] AS [t1] ON [t0].[CategoryID] = [t1].[CategoryID]
GROUP BY [t0].[CategoryID], [t0].[CategoryName]
GO

-- Region Parameters
DECLARE @x1 Int SET @x1 = 1
DECLARE @x2 NVarChar(9) SET @x2 = 'Beverages'
-- EndRegion
SELECT [t1].[ProductID], [t1].[ProductName], [t1].[SupplierID], [t1].[CategoryID],  [t1].[QuantityPerUnit], [t1].[UnitPrice], [t1].[UnitsInStock], [t1].[UnitsOnOrder], [t1].[ReorderLevel], [t1].[Discontinued]
FROM [Categories] AS [t0]
INNER JOIN [Products] AS [t1] ON [t0].[CategoryID] = [t1].[CategoryID]
WHERE (@x1 = [t0].[CategoryID]) AND (@x2 = [t0].[CategoryName])
GO

-- Region Parameters
DECLARE @x1 Int SET @x1 = 2
DECLARE @x2 NVarChar(10) SET @x2 = 'Condiments'
-- EndRegion
SELECT [t1].[ProductID], [t1].[ProductName], [t1].[SupplierID], [t1].[CategoryID], [t1].[QuantityPerUnit], [t1].[UnitPrice], [t1].[UnitsInStock], [t1].[UnitsOnOrder], [t1].[ReorderLevel], [t1].[Discontinued]
FROM [Categories] AS [t0]
INNER JOIN [Products] AS [t1] ON [t0].[CategoryID] = [t1].[CategoryID]
WHERE (@x1 = [t0].[CategoryID]) AND (@x2 = [t0].[CategoryName])
GO

...

Is there a way to do the equivalent of a inner join and group by and still only produce a single query like the group join?

+1  A: 

What is the purpose of that join?

Your original query is identical to this:

from c in Categories
select new { Category = new { c.CategoryID, c.CategoryName }, c.Products }

Am I somehow missing something obvious???

If you want only categories with products, then do this:

from c in Categories
where c.Products.Any()
select new { Category = new { c.CategoryID, c.CategoryName }, c.Products }

Or, if you want to flatten the results:

from p in Products
select new { p, p.Category.CategoryID, p.Category.CategoryName }

The latter will translate into an inner or outer join - depending on whether that relationship is nullable. You can force the equivalent of an inner join as follows:

from p in Products
where p.Category != null
select new { p, p.Category.CategoryID, p.Category.CategoryName }
Joe Albahari
You are missing something but I guess it's not that obvious. I need to do a join because the actual tables that I am running the query on do not have a FK relationship. I only used the Northwind tables as examples. I need a group join, I do not want to flatten the list. Hence the title of the question. In other words, I would like the results to return a category and it's list of products, but I don't want the categories with no products to be included. But the point wasn't really, whether I could run the query without group but how I can do a group by that produces a single inner join query.
dannie.f
Your second query was the most useful though, I can filter the group with the ps.Any() and that works to remove the categories that have no products but It is still an outer join. I guess I am curious to know why I can't produce an inner join that works without producing so many sql queries.
dannie.f
+1  A: 
var queryYouWant =
  from c in Categories 
  join p in Products on c equals p.Category
  select new {Category = c, Product = p};

var result =
  from x in queryYouWant.AsEnumerable()
  group x.Product by x.Category into g
  select new { Category = g.Key, Products = g }; 

Is there a way to do the equivalent of a inner join and group by and still only produce a single query like the group join?

No. When you say GroupBy followed by non-aggregated access of the group elements, that's a repeated query with the group key as a filter.

David B
Thanks. I was thinking there must be a way to get the results I wanted without first flattening the list then grouping, but guess I would have to do it in 2 steps.
dannie.f