Stack Overflowers:

I have a search function on my company's website (based on .NET 2.0) that allows you to narrow the product catalog using up to 9 different fields. Right now, after you make your selections on the frontend, I build a dynamic query and hit the database (SQL Server) to get the resulting list of item numbers.

I would like to move away from hitting the database every time and do all of this in memory for faster results. Basically a 3500 - 4500 row "table" with 10 columns: the item number (which could be a primary key) and the 9 attribute fields (whose values repeat across many rows). A search can combine any number of the 9 columns to get the items you want:

  • Column A = 'foo' AND Column D = 'bar'
  • Column B = 'foo' AND Column C = 'bar' AND Column I = 'me'
  • Column H = 'foo'
  • etc...

Based on my research, the .Select() function seems like the slowest way to perform the search, but it stands out to me as being the quickest and easiest way to perform the narrowing searches to get the list of item numbers:

MyDataSet.Tables[0].Select("[Column B] = 'foo' AND [Column E] = 'bar' AND [Column I] = 'me'")

In my specific case, what alternative do you suggest that offers the same narrowing functionality with better performance, instead of settling for DataTable.Select()?

+1  A: 

DataTables are not built for being queried efficiently. I wouldn't recommend going down this route unless you have a documented performance problem that you're certain it would fix.

If your dynamic queries are slow, it's probably because you haven't indexed your table properly in your database. Databases are designed to be able to optimally query your data, so my hunch would be that a little work on the database side of things should get you where you need to go.

If you really need to query ADO.NET DataTables, make sure you read Scaling ADO.Net DataTables thoroughly. It covers things you can do to speed them up, and gives you some benchmarks so you can see the difference.
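To give a flavor of the kind of thing that can help (this is one commonly used technique, not necessarily what that article recommends), a DataView sorted on the columns you search by can look rows up with FindRows() instead of evaluating a filter string against every row. A minimal sketch, assuming a table with a hypothetical int ItemNumber column; the IndexedLookup class and its method names are made up for illustration:

```csharp
using System;
using System.Data;

public static class IndexedLookup
{
    // Builds a view sorted on the columns that will be searched.
    // sortColumns is a comma-separated list such as "ColumnA, ColumnD".
    public static DataView BuildView(DataTable table, string sortColumns)
    {
        DataView view = new DataView(table);
        view.Sort = sortColumns;
        return view;
    }

    // Looks up rows whose sort-column values equal the key values.
    // The key values must be given in the same order as the sort columns.
    public static int[] FindItemNumbers(DataView view, object[] key)
    {
        DataRowView[] hits = view.FindRows(key);
        int[] items = new int[hits.Length];
        for (int i = 0; i < hits.Length; i++)
        {
            // "ItemNumber" is an assumed column name.
            items[i] = (int)hits[i]["ItemNumber"];
        }
        return items;
    }
}
```

The catch is that each view only helps for its own column combination, so with 9 fields you'd have to decide which combinations are worth a dedicated view.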

womp
Sorry, I should have specified that the dynamic queries I currently run are quite fast, as everything is indexed correctly. I just want the app to run even faster by not going over the wire at all if I don't have to.
NinjaBomb
@NinjaBomb: you'll still be going over the wire (between client and server) with this approach. All you'd avoid is the trip between the server and the database (which should not be a bottleneck for you anyway), and it's at the price of a big chunk of server memory.
MusiGenesis
What do the experts think about caching the result sets that I get from the database in case someone runs the exact same search again within a specified amount of time?
NinjaBomb
If you expect certain queries to be duplicated often, then you should cache the results for sure. With 9 fields, you'll have a lot of permutations of possible queries, but you could still track down popular ones to see how to vary your caching.
womp
@NinjaBomb: it's the "in case" part that's the problem - by caching DataTables, you're committing potentially large amounts of server memory to something that may not (and probably will not) be needed again during the life of the cache. SQL Server already caches query execution plans, the generation of which is a big part of a query's cost anyway.
MusiGenesis
+2  A: 

Your best alternative is to let your database do what it's best at: querying and filtering data.
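As a rough sketch of what that can look like (not necessarily how your current dynamic query is built), the selected filters can be turned into a parameterized WHERE clause so SQL Server does the narrowing. The Products table, the column names, and the ProductSearch class below are all made up for illustration, and the code sticks to C# 2.0 syntax since you're on .NET 2.0:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ProductSearch
{
    // Hypothetical whitelist: only these columns may appear in the WHERE clause.
    private static readonly string[] AllowedColumns =
    {
        "ColumnA", "ColumnB", "ColumnC", "ColumnD", "ColumnE",
        "ColumnF", "ColumnG", "ColumnH", "ColumnI"
    };

    // Builds the SQL text for the selected filters (column name -> value) and
    // appends one (parameter name, value) pair per filter to 'parameters'.
    // With no filters, the query has no WHERE clause and returns every item.
    public static string BuildQuery(
        IDictionary<string, string> filters,
        IList<KeyValuePair<string, string>> parameters)
    {
        StringBuilder sql = new StringBuilder("SELECT ItemNumber FROM Products");
        int i = 0;
        foreach (KeyValuePair<string, string> filter in filters)
        {
            // Column names cannot be parameterized, so reject anything unknown.
            if (Array.IndexOf(AllowedColumns, filter.Key) < 0)
            {
                throw new ArgumentException("Unknown column: " + filter.Key);
            }
            sql.Append(i == 0 ? " WHERE " : " AND ");
            sql.Append("[").Append(filter.Key).Append("] = @p").Append(i);
            parameters.Add(new KeyValuePair<string, string>("@p" + i, filter.Value));
            i++;
        }
        return sql.ToString();
    }
}
```

Each pair in the parameter list would then be bound to the command with something like SqlCommand.Parameters.AddWithValue(name, value) before executing it, so user-supplied values never get concatenated into the SQL text.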

Caching DataTables (especially ones with 3500-4500 rows) is a bad idea for web applications. Calling Select() on a DataTable doesn't reduce the number of rows in the DataTable. It returns an array of the matching rows (references to the table's own rows, not a smaller table), which means you'll still have the original 4000 rows sitting in the cache. Better to have nothing at all in the cache, and just get the rows you need when the user requests them.
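A small sketch makes the row-count point concrete. The three-row table, its column names, and the SelectDemo class are invented for illustration; a real cached table would hold the full catalog:

```csharp
using System;
using System.Data;

public static class SelectDemo
{
    // Builds a tiny stand-in for the cached product table (hypothetical columns).
    public static DataTable BuildSampleTable()
    {
        DataTable table = new DataTable("Products");
        table.Columns.Add("ItemNumber", typeof(int));
        table.Columns.Add("ColumnA", typeof(string));
        table.Columns.Add("ColumnD", typeof(string));
        table.Rows.Add(1, "foo", "bar");
        table.Rows.Add(2, "foo", "baz");
        table.Rows.Add(3, "qux", "bar");
        return table;
    }

    // Runs Select() and returns { number of matches, rows still in the table }.
    public static int[] CountsAfterSelect(DataTable table, string filter)
    {
        DataRow[] matches = table.Select(filter);
        return new int[] { matches.Length, table.Rows.Count };
    }
}
```

Calling CountsAfterSelect(SelectDemo.BuildSampleTable(), "ColumnA = 'foo' AND ColumnD = 'bar'") gives 1 match while the table still holds all 3 rows, so whatever was cached stays cached in full.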

DataTables (and DataSets) are best used with fat clients (usually Windows applications) that need to work with in-memory copies of database data while in a disconnected state.

MusiGenesis