I wrote the following query:

var list = from book in books
          where book.price > 50
          select book;

list = list.Take(50);

I would expect the above to generate something like:

SELECT top 50 id, title, price, author
FROM Books
WHERE price > 50

but it generates:

SELECT
[Limit1].[C1] as [C1],
[Limit1].[id] as [Id], 
[Limit1].[title] as [title], 
[Limit1].[price] as [price], 
[Limit1].[author]
FROM (SELECT TOP (50) 
             [Extent1].[id] as [Id], 
             [Extent1].[title] as [title], 
             [Extent1].[price] as [price], 
             [Extent1].[author] as [author]
      FROM Books as [Extent1]
      WHERE [Extent1].[price] > 50
     ) AS [Limit1]

Why does the above LINQ query generate a subquery, and where does the C1 come from?

A: 

The subquery is generated for projection purposes. It makes more sense when you select from multiple tables into a single anonymous object: the outer query is then used to gather the results.

See what happens with something like this:

from book in books
where book.price > 50
select new 
{
  Title = book.title,
  Chapters = from chapter in book.Chapters
             select chapter.Title
}
Sander Rijken
Yes, I can see it happening with multiple tables, but what do you mean by projection purposes?
Xaisoft
Projections are things like Sum, Max, Min, and also Take().
Sander Rijken
OK, so if I took out the Take, it probably would not generate the subquery?
Xaisoft
+2  A: 

Disclaimer: I've never used LINQ before...

My guess would be paging support. I guess you have some sort of Take(50, 50) method that gets 50 records, starting at record 50. Take a look at the SQL that query generates and you will probably find that it uses a similar subquery structure, which lets it return any 50 rows of a query in approximately the time it takes to return the first 50 rows.

In any case, the nested subquery doesn't add any performance overhead, as it's automagically optimised away during compilation of the execution plan.

Kragen
Yeah, I checked the timing and it is virtually the same. I was just curious why they do it this way when it can easily be achieved without the subquery.
Xaisoft
Just for the record, it would really be Take(50) and Skip(50), and it was intended for pagination, but I really think that in this case it's a matter of how you build the expression.
Omar
Out of interest, what is the query that LINQ executes if you skip, say, 100 rows?
Kragen
Together with the Take, like Take(50).Skip(100), or as another separate statement, or without the Take?
Xaisoft
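The order of Skip and Take raised in the comments above matters. The following is a minimal sketch using LINQ to Objects, with a hypothetical in-memory sequence standing in for ordered table rows. It shows only the semantics of the two orderings, not the SQL any provider would generate (database providers may also require an OrderBy before Skip).

```csharp
using System;
using System.Linq;

static class Program
{
    static void Main()
    {
        // Hypothetical stand-in for 200 ordered rows with ids 1..200.
        var rows = Enumerable.Range(1, 200);

        // Skip first, then take: yields rows 101..150, i.e. one page of 50.
        var page = rows.Skip(100).Take(50).ToList();

        // Take first, then skip: only 50 rows survive Take,
        // so skipping 100 of them leaves nothing.
        var empty = rows.Take(50).Skip(100).ToList();

        Console.WriteLine($"{page.Count} rows, {page.First()}..{page.Last()}"); // 50 rows, 101..150
        Console.WriteLine(empty.Count);                                         // 0
    }
}
```

For paging, Skip(n).Take(pageSize) is therefore the ordering that makes sense.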
A: 

Isn't it a case of the first query returning the total number of rows while the second extracts the subset of rows based on the call to the .Take() method?

daft
Slower than a turtle on a cold, cold winter's day...
daft
OK, I see what you are saying. The first LINQ query generates the outer SQL query, and list = list.Take(50) generates the subquery. Is that correct?
Xaisoft
Turtles actually run pretty fast, lol
Xaisoft
That's what I'm thinking, but I haven't tried it out so I'm not 100% certain.
daft
I've tried the exact same statement as shown above and I get the desired SQL output.
Stan R.
+1  A: 

You could still make it cleaner like this:

var c = (from co in db.countries
         where co.regionID == 5
         select co).Take(50);

This will result in:

Table(country).Where(co => (co.regionID = Convert(5))).Take(50)

Equivalent to:

SELECT TOP (50) [t0].[countryID], [t0].[regionID], [t0].[countryName], [t0].[code]
FROM [dbo].[countries] AS [t0]
WHERE [t0].[regionID] = 5

EDIT (in response to the comments): It's not necessarily because Take() is called separately. You can still use it like this:

var c = (from co in db.countries
         where co.regionID == 5
         select co);
var l = c.Take(50).ToList();

And the result would be the same as before:

SELECT TOP (50) [t0].[countryID], [t0].[regionID], [t0].[countryName], [t0].[code]
FROM [dbo].[countries] AS [t0]
WHERE [t0].[regionID] = @p0

The fact that you wrote IQueryable = IQueryable.Take(50) is the tricky part here.

Omar
Yeah, I have made it cleaner, but I was curious why it generates a subquery with the LINQ query I provided. Is it because I am doing the Take separately?
Xaisoft
I think this last line is what makes the difference here: "list = list.Take(50)". It's projecting itself and then applying the TAKE / TOP.
Omar
I updated my post with another example, where you can use .Take() separately and still get a clean query.
Omar
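Whether writing list = list.Take(50) as a separate statement changes the query can be checked at the expression-tree level, independently of any SQL provider. The following is a minimal sketch, assuming a hypothetical Book class and an in-memory list exposed through AsQueryable(). It compares only the expression trees that a provider would receive. The SQL itself depends on the provider and is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity standing in for a row of the Books table.
class Book
{
    public string Title { get; set; }
    public decimal Price { get; set; }
}

static class Program
{
    static void Main()
    {
        // In-memory data exposed as IQueryable so that expression trees are built.
        IQueryable<Book> books = new List<Book>
        {
            new Book { Title = "A", Price = 60m },
            new Book { Title = "B", Price = 40m },
            new Book { Title = "C", Price = 75m },
        }.AsQueryable();

        // Form 1: Take applied in a separate statement, as in the question.
        IQueryable<Book> separate = from book in books
                                    where book.Price > 50
                                    select book;
        separate = separate.Take(50);

        // Form 2: Take chained directly onto the query expression.
        IQueryable<Book> chained = (from book in books
                                    where book.Price > 50
                                    select book).Take(50);

        // Both forms compose Where(...).Take(50) over the same source.
        Console.WriteLine(separate.Expression);
        Console.WriteLine(chained.Expression);
        Console.WriteLine(separate.Expression.ToString() == chained.Expression.ToString());
    }
}
```

The last line should print True: both forms hand the provider the same Where(...).Take(50) tree. If that holds, any difference in the generated SQL would come from how a given provider translates that tree, rather than from the reassignment itself.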
A: 
  1. I agree with @Justin Swartsel. There was no error involved, so this is largely an academic matter.
  2. Linq-to-SQL endeavors to generate SQL that runs efficiently (which it did in your case).
    1. But it does not make any effort to generate conventional SQL that a human would likely create.
  3. The Linq-to-SQL implementers likely used the builder pattern to generate the SQL.
    1. If so, it would be easier to append a substring (or a subquery in this case) than it would be to backtrack and insert a 'TOP x' fragment into the SELECT clause.
Jim G.
So what you are saying is that the LINQ designers were just lazy, and it was easier for them to generate a subquery than to do a TOP in the first query?
Xaisoft
@Xaisoft: If the implementers used the builder pattern and favored subquery-appends over 'TOP x'-inserts, they probably did so because the approach yielded robust code, not necessarily because it was easier.
Jim G.