I’m looking for some advice. I recently wrapped up a project where I inherited some terrible code. I got the application running but it’s safe to say there are a number of performance and design issues, specifically with the advanced search functionality. I have now been asked to do a very similar project but much larger in scale. I have an opportunity here to build a much better domain model from scratch and create a much better application as a whole. The question is what would be the best way to implement the advanced search?

The advanced search page brings up a form with two required text fields, four optional dropdown lists, and two separate areas with multiple optional checkboxes to filter the results even further.

The current solution uses the two required fields to return a List of objects, which I then filter based on any optional form values. I then put the filtered List into the cache, keyed by the user’s session ID. On each results page, I have an Html.Helper that displays paging links, and each link uses the Skip().Take() approach on the List retrieved from the cache to display 10 results.
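The paging step described above can be sketched roughly as follows. This is only an illustration, written in Python rather than the application's C#; `page_of` and `PAGE_SIZE` are hypothetical names, and the list slice stands in for LINQ's `Skip(n).Take(m)`.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

PAGE_SIZE = 10  # results per page, as in the question


def page_of(results: Sequence[T], page: int, page_size: int = PAGE_SIZE) -> List[T]:
    """Return one page of an already-filtered result list (pages start at 1)."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    skip = (page - 1) * page_size  # how many items to skip
    # Skip first, then take: the order matters.
    return list(results[skip:skip + page_size])


cached_results = list(range(1, 36))  # stand-in for the cached search results
print(page_of(cached_results, 1))    # the first ten items
print(page_of(cached_results, 4))    # the final, partial page
```

Note that the skip has to come before the take; taking ten items first and then skipping would only ever look at the first ten results.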

The issue I’m having is that the list can get pretty beefy. I’m trying to save a database call for every new page of results by putting the list in the cache, but I’m not sure if that’s the best way to approach this. Should I just put all the form values in a monster query string and keep making database calls from the GET requests? Should I save the user’s search criteria in a database or session and use those for every new page of results? Am I right to use the cache to hold such a beefy collection?
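For comparison, the "monster query string" option could look something like the sketch below: each paging link carries the full search criteria, so nothing has to be held per session. The field names (`name`, `city`, `features`) and the `/search` path are made up for illustration, and the sketch is in Python rather than ASP.NET MVC.

```python
from urllib.parse import parse_qs, urlencode


def build_page_url(base_path, criteria, page):
    """Encode the search criteria and the page number into a GET URL."""
    # Leave out optional fields the user did not fill in.
    params = {k: v for k, v in criteria.items() if v not in (None, "", [])}
    params["page"] = page
    # doseq=True repeats the key for multi-valued fields such as checkboxes.
    return base_path + "?" + urlencode(params, doseq=True)


def read_criteria(query_string):
    """Recover the criteria (as lists of values) and the page number."""
    parsed = parse_qs(query_string)
    page = int(parsed.pop("page", ["1"])[0])
    return parsed, page


# Hypothetical form values: two required fields, one unused dropdown, two checkboxes.
form = {"name": "smith", "city": "austin", "color": None, "features": ["pool", "garage"]}
url = build_page_url("/search", form, 3)
print(url)
print(read_criteria(url.split("?", 1)[1]))
```

A side effect of this approach is that every results page becomes bookmarkable, at the cost of one query per page view.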

I asked a question similar to this here: http://stackoverflow.com/questions/1663616/paging-search-results-with-asp-net-mvc that led me to using the cache. Now that I have a clean slate I would love to follow best practices and do it right from the start. Any suggestions would be great.

+1  A: 

Obviously, the best way to implement advanced search depends on your requirements and on your current estimates of the size of the search space and the frequency of searches. My guess is that your biggest issue is whether to implement search as a part of an ORM or in the relational database.

Large SQL queries can be slow, hard to tune, and difficult to debug. If advanced search is a major feature and has lots of complex business rules, then consider implementing search as a part of your ORM. Intelligent search would also favor the use of an ORM.

Loading lots of large objects into memory can take its toll on both performance and scalability. If the search space is large, then consider using SQL to do the searching.
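As a rough sketch of what "using SQL to do the searching" might look like, the example below builds one parameterized query from the required fields plus whichever optional filters were supplied, and lets the database return a single page. It uses Python's built-in `sqlite3` module with an in-memory database; the `listing` table, its columns, and the `search` function are all assumptions made up for the example, not part of the original application.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory table with 25 made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE listing (id INTEGER PRIMARY KEY, name TEXT, city TEXT,"
        " category TEXT, color TEXT)"
    )
    rows = [
        (i, "item %d" % i, "austin", "a" if i % 2 else "b", "red" if i % 3 == 0 else "blue")
        for i in range(1, 26)
    ]
    conn.executemany("INSERT INTO listing VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def search(conn, name, city, category=None, colors=(), page=1, page_size=10):
    """Filter and page in the database; optional filters are added only if set."""
    where = ["name LIKE ?", "city = ?"]          # the two required fields
    args = ["%" + name + "%", city]
    if category is not None:                     # an optional dropdown
        where.append("category = ?")
        args.append(category)
    if colors:                                   # an optional checkbox group
        where.append("color IN (%s)" % ", ".join("?" for _ in colors))
        args.extend(colors)
    sql = (
        "SELECT id, name FROM listing WHERE " + " AND ".join(where)
        + " ORDER BY id LIMIT ? OFFSET ?"
    )
    args.extend([page_size, (page - 1) * page_size])
    return conn.execute(sql, args).fetchall()


conn = make_demo_db()
print(search(conn, "item", "austin", category="a", colors=("red",)))
print(search(conn, "item", "austin", page=3))
```

Only placeholders are concatenated into the SQL text; every user-supplied value travels as a bound parameter. Only one page of rows is ever loaded into application memory.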

There are algorithms available for distributed search but they are very sophisticated and your budget, expertise, and delivery schedule may not accommodate that approach. How strategic is advanced search to the success of the project?

Does the search have to be real-time? If not, then consider a hybrid approach where objects are indexed during non-peak times. It is wasteful to have to search through low-relevancy words such as "the" or "because", so searching through a paraphrased version of the objects may strike a happy medium.
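One possible reading of the "paraphrased version" idea is a keyword index built offline with low-relevancy words dropped. The sketch below is a minimal in-memory version in Python; the stop-word list, the function names, and the sample documents are illustrative assumptions, and a real system would more likely rely on the database's or a search engine's own full-text indexing.

```python
import re
from collections import defaultdict

# A tiny, illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"the", "because", "a", "an", "and", "of", "to", "in", "by"}


def keywords(text):
    """Lower-case the text, split it into words, and drop stop words."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}


def build_index(documents):
    """Map each remaining keyword to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in keywords(text):
            index[word].add(doc_id)
    return index


def lookup(index, query):
    """Return ids of documents that contain every non-stop word in the query."""
    terms = keywords(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(term, set()) for term in terms))


docs = {
    1: "The red house by the lake",
    2: "A red car",
    3: "House because of the lake",
}
index = build_index(docs)  # this is the step that could run during non-peak times
print(lookup(index, "the red house"))
print(lookup(index, "lake house"))
```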

Good luck and have fun!

Glenn
Thanks for your thoughts. The advanced search portion is the most important and most often used feature, and is therefore extremely strategic to the overall project. You put a pretty heavy emphasis on letting the SQL and ORM do the searching and filtering... so is my design of returning all objects that match the two required fields' values and then filtering that List in C# code based on the optional form values a bad practice?
DM
It sounds like you want to do some filtering in the data tier and some filtering in the web tier. Is that right? That may be a fine approach. It doesn't sound like this is a large development effort. Why don't you go with that approach? If you run into problems later, you can always replace that code with something else. Try to modularize the code such that plugging in an alternative will be easy to do.
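A sketch of what that modularity might look like: the page-rendering code depends only on a small interface, so a cache-backed implementation can later be swapped for one that queries the data tier per page. All names here (`ResultPager`, `CachingPager`, `QueryingPager`, `results_for_request`) are hypothetical, and the example is in Python rather than C#.

```python
from typing import Callable, Dict, List, Protocol, Tuple

Criteria = Tuple[str, ...]  # hypothetical: a hashable form of the search criteria


class ResultPager(Protocol):
    """The only thing the web tier needs to know about."""

    def page(self, criteria: Criteria, page: int, page_size: int) -> List[str]: ...


class CachingPager:
    """Runs the full search once per criteria and pages the cached list."""

    def __init__(self, run_search: Callable[[Criteria], List[str]]) -> None:
        self._run_search = run_search
        self._cache: Dict[Criteria, List[str]] = {}

    def page(self, criteria: Criteria, page: int, page_size: int) -> List[str]:
        if criteria not in self._cache:
            self._cache[criteria] = self._run_search(criteria)
        start = (page - 1) * page_size
        return self._cache[criteria][start:start + page_size]


class QueryingPager:
    """Asks the data tier for exactly one page on every request."""

    def __init__(self, fetch_page: Callable[[Criteria, int, int], List[str]]) -> None:
        self._fetch_page = fetch_page  # called as (criteria, offset, limit)

    def page(self, criteria: Criteria, page: int, page_size: int) -> List[str]:
        return self._fetch_page(criteria, (page - 1) * page_size, page_size)


def results_for_request(pager: ResultPager, criteria: Criteria, page: int) -> List[str]:
    """Web-tier code: unaware of which implementation it was given."""
    return pager.page(criteria, page, page_size=10)


data = ["result %d" % i for i in range(25)]  # stand-in for the search results
cached = CachingPager(lambda criteria: list(data))
queried = QueryingPager(lambda criteria, offset, limit: data[offset:offset + limit])
print(results_for_request(cached, ("smith", "austin"), 3))
print(results_for_request(queried, ("smith", "austin"), 3))
```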
Glenn
Good advice. As far as creating a paging system for these search results, would you recommend putting all the form values in a monster query string and doing the database call and filter for every paging request or is my cache system the best practice?
DM
Pretty much the same advice as before. Go with the caching approach because the code commitment is minimal but be prepared to replace that code if you start running into scalability problems. Also, you're not caching much more than what the grid displays, right?
Glenn
Yes, I'm currently caching the bare minimum of information needed to display the results, but part of each object in the list is a byte[] array for a thumbnail image. So if the user's search criteria return 100+ results, the collection gets pretty beefy and tough on the cache. Thoughts?
DM