We have some search functionality that can return tens of thousands of results from the database, although it only fetches the rows needed for display, e.g. the first 10 records. When the next page is requested, we hit the database again. The search is driven by a set of variables, and it can then be refined, which results in another database hit. The query is fairly complex.

We've been looking at different ways of doing this that fit with our overall architecture.

The first way is to use a stored procedure, probably populating a list of entities. This stored proc could quickly become large and unwieldy, but will have better performance.

The second way is to use LINQ to Entities or Entity SQL with Entity Framework 4.0, building the query in code against our conceptual layer and populating POCO objects via IQueryable. The advantages for us are:

  • Abstraction: we're using EF in other places in the application, so we would like to search against the abstracted model if possible.
  • Type safety, and we can chain filters on IQueryable to do what we want cleanly in an object-oriented way (see the sketch after this list).
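
To make that concrete, here is a rough sketch of the chained-filter approach, assuming a hypothetical Product POCO and SearchCriteria class (context.Products would supply the IQueryable<Product>); Skip/Take end up in the generated SQL, so only the requested page is materialised:

    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical POCO and criteria types, just to make the shape concrete.
    public class Product
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public decimal Price { get; set; }
    }

    public class SearchCriteria
    {
        public string Name { get; set; }
        public decimal? MaxPrice { get; set; }
    }

    public static class ProductSearch
    {
        // 'products' would come from the context, e.g. context.Products.
        public static IList<Product> Search(IQueryable<Product> products,
                                            SearchCriteria criteria,
                                            int pageIndex, int pageSize)
        {
            IQueryable<Product> query = products;

            // Filters are chained conditionally; nothing executes until ToList().
            if (!string.IsNullOrEmpty(criteria.Name))
                query = query.Where(p => p.Name.Contains(criteria.Name));

            if (criteria.MaxPrice.HasValue)
                query = query.Where(p => p.Price <= criteria.MaxPrice.Value);

            return query.OrderBy(p => p.Name)        // a stable order is needed before paging
                        .Skip(pageIndex * pageSize)  // both translated into the generated SQL,
                        .Take(pageSize)              // so only one page crosses the wire
                        .ToList();
        }
    }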

Our main concern with this approach is performance. We hope to utilise Parallel LINQ to Entities and can throw more hardware at it if needed. A small performance hit is OK in exchange for a cleaner development pattern.

We would appreciate hearing people's thoughts and recommendations on this. We're new to a lot of these technologies, so we would like to hear about people's experiences.

A: 

You said the ultimate goal is performance. That would mean ADO.NET and straight SQL to me. Adding EF on top of it is a huge amount of overhead for something that needs no state tracking, no update ability, and won't even use all the results.

Write SQL against the database and let it do as much of the paging as possible. Never pull thousands of entries when you plan to throw them away. You also can't take advantage of your server's power with EF for things like full-text search (FTS) or index hint optimisation. You are at the mercy of the EF runtime, which is generic and does not know how to take advantage of specific hardware or servers.
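
For what it's worth, a rough sketch of what I mean, assuming SQL Server 2005/2008 (ROW_NUMBER-based paging) and a made-up dbo.Products table and filter; the point is that only one page of rows ever leaves the server:

    using System.Collections.Generic;
    using System.Data.SqlClient;

    public static class SqlPagedSearch
    {
        public static List<string> GetPage(string connectionString, string nameFilter,
                                           int pageIndex, int pageSize)
        {
            // Paging happens inside the database; only pageSize rows come back.
            const string sql = @"
                SELECT Name
                FROM (
                    SELECT Name, ROW_NUMBER() OVER (ORDER BY Name) AS RowNum
                    FROM dbo.Products
                    WHERE Name LIKE @name + '%'
                ) AS Paged
                WHERE RowNum BETWEEN @first AND @last
                ORDER BY RowNum;";

            var results = new List<string>();
            using (var connection = new SqlConnection(connectionString))
            using (var command = new SqlCommand(sql, connection))
            {
                command.Parameters.AddWithValue("@name", nameFilter);
                command.Parameters.AddWithValue("@first", pageIndex * pageSize + 1);
                command.Parameters.AddWithValue("@last", (pageIndex + 1) * pageSize);

                connection.Open();
                using (var reader = command.ExecuteReader())
                {
                    while (reader.Read())
                        results.Add(reader.GetString(0));
                }
            }
            return results;
        }
    }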

You should also look at a caching layer, since you know the user is going to query the next set some percentage of the time. It is cheaper to fetch 2x the initial results and cache the second half for when they come back; otherwise you expire them at some point.
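
A minimal sketch of that fetch-ahead idea, using .NET 4's MemoryCache; the fetchRows delegate (offset, count), the cache-key scheme and the five-minute expiry are all assumptions for illustration, not a prescription:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Runtime.Caching;

    public static class SearchCache
    {
        // fetchRows takes (offset, count) and runs the real query (SQL or EF).
        public static IList<T> GetPage<T>(string searchKey, int pageIndex, int pageSize,
                                          Func<int, int, IList<T>> fetchRows)
        {
            string key = searchKey + ":" + pageIndex;
            var cached = MemoryCache.Default.Get(key) as IList<T>;
            if (cached != null)
                return cached;

            // Cache miss: pull two pages in one round trip, return the first,
            // and cache the second for the likely "next page" request.
            IList<T> rows = fetchRows(pageIndex * pageSize, pageSize * 2);
            IList<T> nextPage = rows.Skip(pageSize).ToList();

            MemoryCache.Default.Set(searchKey + ":" + (pageIndex + 1), nextPage,
                                    DateTimeOffset.Now.AddMinutes(5)); // expire at some point

            return rows.Take(pageSize).ToList();
        }
    }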

Jason Short
Yeah, we're definitely going to return a subset of the results, not the whole lot, and implement a caching layer. We're going to do some performance tests to compare results. We'll be using POCOs in EF to keep EF's footprint as small as possible - though I appreciate this could still be quite big. Of course it will be slower, but it will be good to see by how much!
Steve Ward
+1  A: 

I've done some performance tests, and using a stored procedure in EF 4.0 to populate an entity or a complex type is almost identical in performance to the same SP accessed via ADO.NET, so we're going to try this method. Using EF's built-in querying was about twice as slow, so we're going to use SPs in this performance-critical situation.
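
In case it helps anyone comparing the same options, the EF 4.0 side of that test looks roughly like this, assuming a function import named "SearchProducts" mapped to a complex type in the EDMX; the names and the stand-in result class below are made up:

    using System.Collections.Generic;
    using System.Data.Objects;
    using System.Linq;

    // Stand-in for the complex type the EDMX designer would generate
    // for the function import's result set.
    public class ProductSearchResult
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public static class StoredProcSearch
    {
        public static List<ProductSearchResult> Search(ObjectContext context, string name,
                                                       int pageIndex, int pageSize)
        {
            // ExecuteFunction runs the mapped stored procedure and materialises
            // the rows into the complex type, with no change tracking involved.
            return context.ExecuteFunction<ProductSearchResult>(
                "SearchProducts",
                new ObjectParameter("Name", name),
                new ObjectParameter("PageIndex", pageIndex),
                new ObjectParameter("PageSize", pageSize)).ToList();
        }
    }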

Steve Ward