I am building UI for a large product catalog (millions of products).

I am using SQL Server with full-text search (FREETEXT) and ASP.NET MVC.
Tables are normalized and indexed. Most queries take less than a second to return.

The issue is this: say a user searches by keyword. For the search results page I need to query for and display:

  1. The first 20 matching products (paged, sorted)
  2. The total count of matching products, for paging
  3. The distinct list of stores across all matching products
  4. The distinct list of brands across all matching products
  5. The distinct list of colors across all matching products

Each query takes about 0.5 to 1 second, so altogether the page takes around 5 seconds.

I would like to get the whole page to load under 1 second.
There are several approaches:

  1. Optimize queries even more. I already spent a lot of time on this one, so not sure it can be pushed further.

  2. Load products first, then load the rest of the information using AJAX. More like a workaround. Will need to revise UI.

  3. Re-organize data to be more Report friendly. Already aggregated a lot of fields.

I checked out several similar sites, for example zappos.com. Not only do they display the same information I want in under 1 second, they also include statistics (the number of results in each category).

The following is the search for keyword "white" http://www.zappos.com/white

How do sites like Zappos and Amazon make their results, filters and stats appear almost instantly?

A: 

You could try replacing your aggregate queries with indexed (materialized) views of those aggregates. This pre-computes all the aggregates, so reading them will be as fast as selecting any regular row data.
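
To make the idea concrete, here is a minimal sketch of a pre-computed aggregate. It is not SQL Server code: it uses Python's built-in sqlite3, which has no indexed views, so a summary table kept up to date by triggers stands in for one. The `products` and `brand_counts` tables and their columns are made up for illustration.

```python
import sqlite3

def build_catalog():
    """Create a toy catalog whose per-brand counts are maintained at write time."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, brand TEXT);

    -- Stand-in for an indexed view: one row per brand with its product count.
    CREATE TABLE brand_counts (
        brand TEXT PRIMARY KEY,
        product_count INTEGER NOT NULL
    );

    -- Keep the aggregate in sync on insert...
    CREATE TRIGGER products_ai AFTER INSERT ON products BEGIN
        INSERT OR IGNORE INTO brand_counts (brand, product_count)
            VALUES (NEW.brand, 0);
        UPDATE brand_counts SET product_count = product_count + 1
            WHERE brand = NEW.brand;
    END;

    -- ...and on delete, dropping brands that no longer have products.
    CREATE TRIGGER products_ad AFTER DELETE ON products BEGIN
        UPDATE brand_counts SET product_count = product_count - 1
            WHERE brand = OLD.brand;
        DELETE FROM brand_counts
            WHERE brand = OLD.brand AND product_count = 0;
    END;
    """)
    return conn

def brand_counts(conn):
    """Read the aggregate as plain rows; no GROUP BY runs at query time."""
    return conn.execute(
        "SELECT brand, product_count FROM brand_counts ORDER BY brand"
    ).fetchall()

conn = build_catalog()
conn.executemany(
    "INSERT INTO products (name, brand) VALUES (?, ?)",
    [("white sneaker", "Acme"), ("white boot", "Acme"), ("white sandal", "Zeta")],
)
conn.execute("DELETE FROM products WHERE name = 'white boot'")
print(brand_counts(conn))
```

The trade-off is the same one an indexed view makes: writes get a little slower so that reads of the aggregate become a simple key lookup.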

KM
A: 

0.5 sec is too long on appropriate hardware. I agree with Aaronaught: the first thing to do is to convert it into a single SQL statement, or possibly a stored procedure, to ensure it is compiled only once.
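
One way to read "a single SQL or stored procedure" is: run the expensive keyword match once, keep the matching rows, and derive the page, the total and the filter lists from that one result set. Below is a rough sketch of that shape, again using Python's sqlite3 rather than SQL Server. The `products` schema is invented, and `LIKE` is only a placeholder for the real full-text predicate.

```python
import sqlite3

FACETS = ("store", "brand", "color")  # fixed column names, never user input

def demo_connection():
    """Tiny in-memory catalog with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT,"
        " store TEXT, brand TEXT, color TEXT)"
    )
    conn.executemany(
        "INSERT INTO products (name, store, brand, color) VALUES (?, ?, ?, ?)",
        [
            ("white sneaker", "ShoeMart", "Acme", "white"),
            ("white boot", "ShoeMart", "Acme", "white"),
            ("white sandal", "BootBarn", "Zeta", "white"),
            ("black boot", "BootBarn", "Zeta", "black"),
        ],
    )
    return conn

def search(conn, keyword, page=1, page_size=20):
    """Match once into a temp table, then answer everything from it."""
    cur = conn.cursor()
    cur.execute("DROP TABLE IF EXISTS temp.matches")
    cur.execute(
        "CREATE TEMP TABLE matches (id INTEGER, name TEXT,"
        " store TEXT, brand TEXT, color TEXT)"
    )
    # The only statement that touches the big table.
    cur.execute(
        "INSERT INTO matches SELECT id, name, store, brand, color"
        " FROM products WHERE name LIKE ?",
        (f"%{keyword}%",),
    )
    total = cur.execute("SELECT COUNT(*) FROM matches").fetchone()[0]
    rows = cur.execute(
        "SELECT id, name FROM matches ORDER BY name, id LIMIT ? OFFSET ?",
        (page_size, (page - 1) * page_size),
    ).fetchall()
    facets = {
        col: cur.execute(
            f"SELECT {col}, COUNT(*) FROM matches GROUP BY {col} ORDER BY {col}"
        ).fetchall()
        for col in FACETS
    }
    return {"total": total, "page": rows, "facets": facets}

result = search(demo_connection(), "white", page=1, page_size=2)
print(result["total"], result["page"], result["facets"]["brand"])
```

In SQL Server the same shape would presumably live in one stored procedure that fills a temp table and returns several result sets, so the application makes a single round trip. The facet queries also yield per-value counts at no extra cost.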

Analyze your queries to see if you can create even better indexes (consider covering indexes), fine tune existing indexes, employ partitioning.

Make sure you have an appropriate hardware config: data, log, temp and even index files should be located on independent spindles. Make sure you have enough RAM and CPUs. I hope you are running a 64-bit platform.

After all this, if you still need more, analyze the most-used keywords and create aggregate result tables for the top 10 keywords.
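
As a rough illustration of the top-keywords idea, the sketch below counts keywords from a search log, pre-computes results for the most frequent ones, and falls back to the live query for everything else. The log format and the `compute_results` callback are assumptions made for the example; in practice the pre-computed results would sit in aggregate tables and be refreshed on a schedule.

```python
from collections import Counter

def top_keywords(search_log, n=10):
    """Return the n most frequent keywords, normalized to lowercase."""
    counts = Counter(k.strip().lower() for k in search_log if k.strip())
    return [keyword for keyword, _ in counts.most_common(n)]

def build_keyword_cache(search_log, compute_results, n=10):
    """Pre-compute results for the top n keywords.

    compute_results is a hypothetical callback that runs the real
    (slow) search and aggregate queries for one keyword.
    """
    return {kw: compute_results(kw) for kw in top_keywords(search_log, n)}

def lookup(keyword, cache, compute_results):
    """Serve popular keywords from the cache, everything else live."""
    key = keyword.strip().lower()
    if key in cache:
        return cache[key]
    return compute_results(key)

if __name__ == "__main__":
    log = ["white", "White ", "boots", "white", "boots", "red"]
    slow_search = lambda kw: {"keyword": kw, "total": len(kw)}  # placeholder
    cache = build_keyword_cache(log, slow_search, n=2)
    print(sorted(cache))
    print(lookup("WHITE", cache, slow_search))
```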

About Amazon - they most likely use superior hardware and also take advantage of CDNs. Also, they have thousands of servers serving up the content, and there are no performance bottlenecks - data is duplicated multiple times across several data centers.

As a completely separate approach, you may want to look into "in-memory" databases such as CACHE - this is the fastest you can get on the DB side.

IMHO