How can I return first 100 records using Linq?

I have a table with 40 million records.

This code works, but it's slow, because I think it returns all the values before applying the filter:

var values = (from e in dataContext.table_sample
              where e.x == 1
              select e)
             .Take(100);

Is there a way to return only the filtered rows, like the T-SQL TOP clause does?

+10  A: 

No, that doesn't return all the values before filtering. The Take(100) will end up being part of the SQL sent to the database, quite possibly as a TOP clause.

Of course, it makes more sense to do that when you've specified an orderby clause.

LINQ doesn't execute the query when it reaches the end of your query expression. It only sends up any SQL when either you call an aggregation operator (e.g. Count or Any) or you start iterating through the results. Even calling Take doesn't actually execute the query - you might want to put more filtering on it afterwards, for instance, which could end up being part of the query.

When you start iterating over the results (typically with foreach) - that's when the SQL will actually be sent to the database.
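To make the deferred execution concrete, here is a minimal sketch that uses LINQ to Objects as a stand-in for the database, since a real DataContext can't be run here. The names DeferredExecutionDemo, Source and Pulled are made up for illustration. With LINQ to SQL the mechanics differ (the query is translated to SQL rather than pulled element by element), but the timing is the same: nothing happens until you iterate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many elements were actually pulled from the source.
    public static int Pulled;

    // Hypothetical stand-in for a large table: yields 1..count lazily.
    public static IEnumerable<int> Source(int count)
    {
        for (int i = 1; i <= count; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    // Returns { pulled before iterating, result count, pulled after iterating }.
    public static int[] Run()
    {
        Pulled = 0;

        // Building the query pulls nothing from the source.
        var query = Source(1000000).Where(x => x % 2 == 0).Take(100);
        int before = Pulled;

        // Iterating (here via ToList) is what actually runs the query.
        // Take stops pulling once it has its 100 matches.
        List<int> results = query.ToList();

        return new[] { before, results.Count, Pulled };
    }

    public static void Main()
    {
        int[] r = Run();
        Console.WriteLine("pulled before iterating: " + r[0]);
        Console.WriteLine("results: " + r[1]);
        Console.WriteLine("pulled after iterating: " + r[2]);
    }
}
```

The count before iterating stays at zero, and the count afterwards is a tiny fraction of the million available elements, because Take ends the enumeration early.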

(I think your where clause is a bit broken, by the way. If you've got problems with your real code it would help to see code as close to reality as possible.)

Jon Skeet
Thanks, the code was corrected.
Zanoni
Not quite it wasn't - I think you mean "where e.x == 1"
Jon Skeet
You're right, corrected.
Zanoni
+1  A: 

I don't think you are right about it returning all records before taking the top 100. I think LINQ decides what the SQL string is going to be at the time the query is executed (deferred execution), so the limit is part of that SQL and your database server will apply it.
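One way to convince yourself that Take is part of the query rather than something applied afterwards is to look at the expression tree the query carries. This is only a sketch: it uses AsQueryable over an in-memory array instead of a real table, and the Row class and QueryShapeDemo name are invented for the example. A LINQ to SQL provider receives the same kind of tree and translates it to SQL when the query runs.

```csharp
using System;
using System.Linq;

public static class QueryShapeDemo
{
    // Hypothetical row type standing in for table_sample.
    public class Row
    {
        public int id;
        public int x;
    }

    // Builds the query without executing it and returns its expression as text.
    public static string Describe()
    {
        // In-memory IQueryable; a real DataContext table would go here instead.
        IQueryable<Row> table = new[] { new Row { id = 1, x = 1 } }.AsQueryable();

        var query = (from e in table
                     where e.x == 1
                     orderby e.id
                     select e)
                    .Take(100);

        // The Where, OrderBy and Take calls are all recorded in one expression.
        return query.Expression.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

The printed expression contains the Where, OrderBy and Take(100) calls chained together, which is what the provider sees when it finally builds the SQL.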

Robert Harvey
+1  A: 

Have you compared a plain SQL query with your LINQ query? Which one is faster, and how significant is the difference?

I do agree with above comments that your linq query is generally correct, but...

  • your 'where' clause should probably use x == 1, not x = 1 (comparison instead of assignment)
  • 'select e' will return all columns when you probably need only some of them - be more precise with the select clause (list only the required columns); 'select *' is a waste of resources
  • make sure your database is well indexed, and try to filter on indexed columns

Anyway, a database with 40 million records is quite large - do you need all that data all the time? Maybe some kind of partitioning can reduce it to the most commonly used records.

twk
+1 good advice!
Meta-Knight
A: 

I'm going to go out on a limb and guess that you don't have an index on the column used in your where clause. If that's the case then it's undoubtedly doing a table scan when the query is materialized and that's why it's taking so long.

tvanfosson
A: 

I agree with Jon Skeet, but just wanted to add:

  1. The generated SQL will use TOP to implement Take().

  2. If you're able to run SQL-Profiler and step through your code in debug mode, you will be able to see exactly what SQL is generated and when it gets executed. If you find the time to do this, you will learn a lot about what happens underneath.

  3. There is also a DataContext.Log property that you can assign a TextWriter to view the SQL generated, for example:

    dbContext.Log = Console.Out;

  4. Another option is to experiment with LINQPad. LINQPad allows you to connect to your data source and easily try different LINQ expressions. In the results panel, you can switch to see the SQL generated for the LINQ expression.

Chris Melinn