Hi,

I'm processing data from a database (~130,000,000 rows).

Because of the large number of rows, I select 1 million, process them, save the results, then select the next 1 million, and so on.

I use `select .. orderby .. skip(m) .. take(n) .. ToList()`

because I want to have these objects in memory.

When I skip 1 million, then 2 million, then 3 million ... up to, let's say, 6 million, it's quite OK, but beyond that the query suddenly takes very long.
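The batching pattern described above can be sketched in plain SQL terms, where `OFFSET`/`LIMIT` play the role of `Skip`/`Take`. This is an illustrative sketch only, run against an in-memory SQLite database; the table and column names (`calls`, `id`, `user_a`) are made up for the example.

```python
import sqlite3

# Sketch of OFFSET-based batching: each pass skips everything already
# processed. The server must re-scan and discard the skipped rows on
# every query, so later batches get progressively slower.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (id INTEGER PRIMARY KEY, user_a INTEGER)")
conn.executemany("INSERT INTO calls VALUES (?, ?)",
                 [(i, i % 7) for i in range(1, 101)])

batch_size = 25
offset = 0
total = 0
while True:
    rows = conn.execute(
        "SELECT id, user_a FROM calls ORDER BY id LIMIT ? OFFSET ?",
        (batch_size, offset)).fetchall()
    if not rows:
        break
    total += len(rows)    # process the batch here
    offset += batch_size  # skipped rows are re-scanned next pass

print(total)  # 100
```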

Have you got the same problem?

Is there any way I can make it work faster?

Thanks for the help. Bye

A: 

You could use Rowcount, but I do not know how this is applicable in Entity Framework.

That way you go for `where Rowcount() > 2 000 000` then `take(1 000 000)`.

Or, if you have an ID column and are traversing in order, add a condition to the where clause: `id > lastProcessedId`.

That should be faster than Skip.
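The ID-based approach (often called keyset or "seek" pagination) can be sketched as follows. This is a hedged illustration against an in-memory SQLite database, not Entity Framework code; the table and column names (`calls`, `id`, `user_a`, `user_b`) are assumptions made up for the example.

```python
import sqlite3

# Keyset pagination: instead of skipping N rows, remember the last ID
# processed and filter on it. WHERE id > ? seeks directly via the
# primary-key index, so no rows are scanned and discarded, and the
# cost per batch stays flat no matter how far along we are.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE calls (id INTEGER PRIMARY KEY, user_a INTEGER, user_b INTEGER)")
conn.executemany("INSERT INTO calls VALUES (?, ?, ?)",
                 [(i, i % 7, i % 11) for i in range(1, 101)])

batch_size = 25
last_id = 0
batches = 0
while True:
    rows = conn.execute(
        "SELECT id, user_a, user_b FROM calls "
        "WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, batch_size)).fetchall()
    if not rows:
        break
    batches += 1
    last_id = rows[-1][0]   # remember where this batch stopped

print(batches, last_id)  # 4 100
```

In LINQ terms this corresponds to replacing `Skip(m)` with a `Where(x => x.Id > lastProcessedId)` before the `Take(n)`.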

David Mårtensson
`Skip` already does a rowcount. That's precisely the problem here. Rowcount is slow on a table this big.
Craig Stuntz
A: 

If you look at the generated SQL, you will see the problem. SQL Server does not have a native SKIP, so the Entity Framework improvises around this. I've explained some of the details in this post.

To do this efficiently, you need to partition your data by a different method, one which can be implemented by the server using an index. Without knowing more about the problem, I can't say what the best approach is, but look for a means of partitioning your data that can be indexed in a SQL query.

Craig Stuntz
I want to create a network where vertices are users and edges are the connections between them (if one user calls another, the edge value is increased). In one row I have UserAId, UserBId, and some other details. I need to process each row to fill in the data, and that is where the efficiency problem is.
gruber