I have a LINQ to SQL query:

from at in Context.Transaction
select new  {
    at.Amount,
    at.PostingDate,
    Details = 
     from tb in at.TransactionDetail
     select new {
      Amount = tb.Amount,
      Description = tb.Desc
     }
}

This results in one SQL statement being executed. All is good.

However, if I attempt to return known types from this query, even if they have the same structure as the anonymous types, I get one SQL statement executed for the top level and then an additional SQL statement for each "child" set.

Is there any way to get LINQ to SQL to issue one SQL statement and use known types?

EDIT: I must have another issue. When I plugged a very simplistic (but still hierarchical) version of my query into LINQPad and used freshly created known types with just 2 or 3 members, I did get one SQL statement. I will post an update when I know more.

EDIT 2: This appears to be due to a bug in Take. See my answer below for details.

A: 

I've not had a chance to try this, but given that the anonymous type isn't part of LINQ but rather a C# construct, I wonder if you could use:

from at in Context.Transaction
select new KnownType {
    Amount = at.Amount,
    PostingDate = at.PostingDate,
    Details = 
        from tb in at.TransactionDetail
        select new KnownSubType {
            Amount = tb.Amount,
            Description = tb.Desc
        }
}

Obviously Details would need to be an IEnumerable collection.

I could be miles wide of the mark on this, but it might at least give you a new line of thought to pursue, which can't hurt, so please excuse my rambling.

Lazarus
+1  A: 

I've now determined this is the result of a horrible bug. The anonymous versus known type turned out not to be the cause. The real cause is Take.

The following each result in one SQL statement:

query.Skip(1).Take(10).ToList();
query.ToList();

However, the following exhibit the one-SQL-statement-per-parent-row problem:

query.Skip(0).Take(10).ToList();
query.Take(10).ToList();

Can anyone think of any simple workarounds for this?

EDIT: The only workaround I've come up with is to check whether I'm on the first page (i.e. Skip(0)) and, if so, make two calls, one with Take(1) and the other with Skip(1).Take(pageSize - 1), and then AddRange the two lists together.
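
For anyone wanting to try it, here is a minimal sketch of that first-page workaround. The PagingWorkaround class and its GetPage method are hypothetical names invented for illustration, not part of LINQ to SQL. It runs as written against any IQueryable<T>, including an in-memory one, but it only saves SQL round trips when the query comes from a LINQ to SQL DataContext, and it assumes the query has a stable ordering so the two calls see the rows in the same order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingWorkaround
{
    // Hypothetical helper: returns one page without ever issuing
    // Take(pageSize) or Skip(0).Take(pageSize) for the whole page.
    public static List<T> GetPage<T>(IQueryable<T> query, int pageIndex, int pageSize)
    {
        if (pageSize <= 0)
        {
            return new List<T>();
        }

        int skip = pageIndex * pageSize;
        if (skip > 0)
        {
            // Later pages: Skip(n > 0).Take(m) already behaves well.
            return query.Skip(skip).Take(pageSize).ToList();
        }

        // First page: fetch one row with Take(1), then the rest of the
        // page with Skip(1).Take(pageSize - 1), and combine the lists.
        List<T> page = query.Take(1).ToList();
        if (pageSize > 1)
        {
            page.AddRange(query.Skip(1).Take(pageSize - 1).ToList());
        }
        return page;
    }
}
```

The Take(1) call is still a bare Take, so it presumably still requeries for the children of that single parent row; the point is only that this cost is paid for one row instead of for the whole page.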

JohnOpincar
+4  A: 

First - some reasoning for the Take bug.

If you just Take, the query translator just uses TOP. TOP 10 will not give the right answer if cardinality is broken by joining in a child collection, so the query translator doesn't join in the child collection (instead it requeries for the children).

If you Skip and Take, then the query translator kicks in with some ROW_NUMBER logic over the parent rows... these row numbers let it take 10 parents, even if that's really 50 records due to each parent having 5 children.

If you Skip(0) and Take, Skip is removed as a non-operation by the translator - it's just like you never said Skip.
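
To make the cardinality point concrete, here is a small in-memory sketch using LINQ to Objects rather than LINQ to SQL. The CardinalityDemo class and its method names are hypothetical, chosen only for this illustration. It builds the flattened rows a parent-child join would return, then compares taking the first n joined rows (what TOP n would do) with filtering on a per-parent number (what the ROW_NUMBER window does).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CardinalityDemo
{
    // Flattened (parent, child) rows, like the result of a SQL join,
    // ordered by parent and then by child.
    public static List<(int ParentId, int ChildId)> JoinedRows(int parentCount, int childrenPerParent) =>
        (from p in Enumerable.Range(1, parentCount)
         from c in Enumerable.Range(1, childrenPerParent)
         select (ParentId: p, ChildId: c)).ToList();

    // Mimics TOP n over the joined rows: counts how many distinct
    // parents survive when only the first n joined rows are kept.
    public static int ParentsInTopRows(List<(int ParentId, int ChildId)> rows, int n) =>
        rows.Take(n).Select(r => r.ParentId).Distinct().Count();

    // Mimics a ROW_NUMBER window over parents: keeps every joined row
    // whose parent number lies in (skip, skip + take], then counts parents.
    // Here ParentId doubles as the parent's row number.
    public static int ParentsInRowNumberWindow(List<(int ParentId, int ChildId)> rows, int skip, int take) =>
        rows.Where(r => r.ParentId > skip && r.ParentId <= skip + take)
            .Select(r => r.ParentId)
            .Distinct()
            .Count();
}
```

With 50 parents of 5 children each, ParentsInTopRows(rows, 10) yields 2, because the first 10 joined rows cover only two parents, while ParentsInRowNumberWindow(rows, 0, 10) yields 10 parents spread over 50 joined rows.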

This is going to be a hard conceptual leap from where you are (calling Skip and Take) to a "simple workaround". What we need to do is force the translation to occur at a point where the translator can't remove Skip(0) as a non-operation. We need to call Skip, and supply the skipped number at a later point.

DataClasses1DataContext myDC = new DataClasses1DataContext();
  //setting up log so we can see what's going on
myDC.Log = Console.Out;

  //hierarchical query - not important
var query = myDC.Options.Select(option => new{
  ID = option.ParentID,
  Others = myDC.Options.Select(option2 => new{
    ID = option2.ParentID
  })
});
  //request translation of the query!  Important!
var compQuery = System.Data.Linq.CompiledQuery
  .Compile<DataClasses1DataContext, int, int, System.Collections.IEnumerable>
  ( (dc, skip, take) => query.Skip(skip).Take(take) );

  //now run the query and specify that 0 rows are to be skipped.
compQuery.Invoke(myDC, 0, 10);

This produces the following query:

SELECT [t1].[ParentID], [t2].[ParentID] AS [ParentID2], (
    SELECT COUNT(*)
    FROM [dbo].[Option] AS [t3]
    ) AS [value]
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY [t0].[ID]) AS [ROW_NUMBER], [t0].[ParentID]
    FROM [dbo].[Option] AS [t0]
    ) AS [t1]
LEFT OUTER JOIN [dbo].[Option] AS [t2] ON 1=1 
WHERE [t1].[ROW_NUMBER] BETWEEN @p0 + 1 AND @p1 + @p2
ORDER BY [t1].[ROW_NUMBER], [t2].[ID]
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [0]
-- @p1: Input Int (Size = 0; Prec = 0; Scale = 0) [0]
-- @p2: Input Int (Size = 0; Prec = 0; Scale = 0) [10]
-- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.30729.1

And here's where we win!

WHERE [t1].[ROW_NUMBER] BETWEEN @p0 + 1 AND @p1 + @p2
David B
This works but I had to use IEnumerable<Transaction> instead of just IEnumerable in order to call ToList. It's also half a second faster in my test case. Nice answer.
JohnOpincar