There are two options for running LINQ queries against collections that are populated from SQL (I'm using an Oracle provider, so no LINQ without an ORM).

1) Do one big SQL query, dump the results into some sort of collection and do LINQ queries on the collection, so you have one big draw on the database, but not much slowdown after that.

2) Do small SQL queries and dump the results into many smaller collections and do LINQ queries on those, so you have smaller draws on the database, but more consistent slowdowns throughout the application.

Anyone have any thoughts on this?

+3  A: 

There is one big difference between these two methods: if you query the database each time you need the data you will get the most recent data, whereas if you read in all the data in one go and reuse it then you will not see the newest changes until you read everything in again. Both systems have advantages and disadvantages, but you need to be aware of this difference.
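To make the freshness trade-off concrete, here is a minimal sketch in Python (standing in for the C# collections in the question) of a cached result set that is reloaded once it gets too old. The `CachedQuery` class, the `load` callback and the `max_age_seconds` parameter are all hypothetical names invented for this illustration; they are not part of any ORM or provider.

```python
import time


class CachedQuery:
    """Holds the rows from one query and reloads them when they are too old.

    Hypothetical helper for illustration only: `load` is any zero-argument
    function that runs the real database query and returns a list of rows.
    """

    def __init__(self, load, max_age_seconds, clock=time.monotonic):
        self._load = load
        self._max_age = max_age_seconds
        self._clock = clock          # injectable so the behaviour can be tested
        self._rows = None
        self._loaded_at = None

    def rows(self):
        now = self._clock()
        # Reload on first use, or when the cached copy is older than max_age.
        if self._rows is None or now - self._loaded_at > self._max_age:
            self._rows = self._load()
            self._loaded_at = now
        return self._rows
```

A small `max_age_seconds` behaves like option 2 (fresh data, frequent database hits), while a very large one behaves like option 1 (one big read, possibly stale data).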

Regarding performance differences - don't assume that LINQ to Objects on local data will always be faster than a database query. Databases are incredibly well optimized for different types of queries and can take advantage of indexes. LINQ to Objects queries generally just iterate over the entire data set. So even if you have the data locally, some queries might actually be slower than if you get the database to do the work, unless you make an effort to index the data yourself.
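As a rough illustration of what "indexing the data yourself" could look like, here is a sketch in Python rather than C#. The list comprehension plays the role of an unindexed `Where(...)` filter, and the dictionary plays the role of a hand-built index (similar in spirit to what LINQ's `ToLookup` gives you). The row layout and the function names are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical rows, as they might look after being loaded from the database.
rows = [
    {"id": i, "customer": f"c{i % 1000}", "total": i * 1.5}
    for i in range(100_000)
]


def scan_by_customer(rows, customer):
    """Linear scan: touches every row on every query, O(n) per lookup."""
    return [r for r in rows if r["customer"] == customer]


def build_index(rows, key):
    """Build a hash index once in O(n); later lookups avoid the full scan."""
    index = defaultdict(list)
    for r in rows:
        index[r[key]].append(r)
    return index


by_customer = build_index(rows, "customer")


def lookup_by_customer(customer):
    # .get() avoids creating empty entries for customers that do not exist.
    return by_customer.get(customer, [])
```

Both functions return the same rows; the difference is that the index has to be built, kept in memory, and rebuilt whenever the data is reloaded, which is work the database otherwise does for you.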

Even for queries where indexes can't be used, databases can still beat a naive LINQ to Objects approach. Databases have some very advanced algorithms that aren't implemented in LINQ to Objects. For example, a common query is to fetch the top 100 items sorted by some criterion, with a filter. Even without a usable index, for a sufficiently large result set the database might still outperform LINQ to Objects, because OrderBy(x => x.Foo).Take(100) will first perform an O(n log n) sort and only afterwards take the first hundred elements and discard the rest. The SQL Server team knows that this type of query is common, so they added a special optimization called TOP N SORT, which can perform this operation in O(n) time for a fixed N. I imagine that Oracle has a similar optimization. I have written another answer that goes into more detail on this point, including some performance measurements of a LINQ to SQL versus a LINQ to Objects query.
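The difference between "sort everything, then take 100" and a top-N pass can be sketched in Python as follows. This is not SQL Server's actual TOP N SORT implementation, which I haven't seen; it only shows one well-known way to get the same result while keeping at most N candidates around, using the standard library's `heapq.nsmallest`. The data layout and the function names are made up for the example.

```python
import heapq
import random


def top_n_full_sort(items, n, key):
    """Sort the whole input in O(n log n), then keep the first n items."""
    return sorted(items, key=key)[:n]


def top_n_heap(items, n, key):
    """Single pass that tracks only the n smallest items seen so far.

    Roughly O(len(items) * log n), which is linear in the input size
    when n is a small constant such as 100.
    """
    return heapq.nsmallest(n, items, key=key)


# Hypothetical data: filter first, then ask for the top 100 by "foo".
random.seed(0)
data = [
    {"id": i, "foo": random.random(), "active": i % 2 == 0}
    for i in range(50_000)
]
active = [d for d in data if d["active"]]

by_foo = lambda d: d["foo"]
first_hundred = top_n_heap(active, 100, by_foo)
```

Both functions return the same 100 rows in the same order; only the amount of work differs, and the gap grows with the size of the filtered set.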

Mark Byers
Fair enough, but is it really a good idea to assume the database is optimized this way?
Riddari