views:

34

answers:

2

I have a website backed by a relational database comprised of the usual e-commerce related tables (Order, OrderItem, ShoppingCart, CreditCard, Payment, Customer, Address, etc...).

The stored proc. which returns order history is painfully slow due to the amount of data + the numerous joins which must occur, and depending on the search parameters it sometimes times out (despite the indexing that is in place).

The DB schema is pretty well normalized and I believe I can achieve better performance by moving toward something like a data warehouse. DW projects aren't trivial and then there's the issue of keeping the data in sync so I was wondering if anyone knows of a shortcut. Perhaps an out-of the box solution that will create the DW schema and keep the data in sync (via triggers perhaps). I've heard of Lucene but it seems geared more toward text searches and document management. Does anyone have other suggestions?

A: 

Materialized Views are what you might use in Oracle. They give you the "keeping the data in sync" feature you are looking for combined with fast access of aggregate data. Since you didn't mention any specifics (platform, server specs, number of rows, number of hits/second, etc) of your platform, I can't really help much more than that.

Of course, we are assuming you've already checked that all your SQL is written properly and optimally, that your indexing is correct, that you are properly using caching in all levels of your app, that your DB server has enough RAM, fast hard drives, etc.

Also, have you considered denormalizing your schema, just enough to serve up your most common queries faster? that's better than implementing an entire data warehouse, which might not even be what you want anyway. Usually a data warehouse is for reporting purposes, not for serving interactive apps.

Peter Recore
Thanks Peter. The query has been analyzed and appropriate indexes added. Unfortunately I can't change the schema, that's why I'm interested in "replicating" to some denormalized, read-only structure.
Keith
SQL Server 2008 supports Materialized Views. They are probably worth looking into, assuming that adding them doesn't count as "changing the schema". they are the closest thing I can think of to a "shortcut" to a replicated DB.
Peter Recore
I created an indexed view (WITH SCHEMABINDING) a.k.a. "materialized view" which helped out tremendously. I also tweaked the SP so fewer joins were required.
Keith
+1  A: 

How big is your database?

There's not really any shortcuts, but dimensional modelling is really NOT that hard. You first determine a grain and then need to identify your facts and the dimensions associated with the facts. Then you divide the dimensions into tables which allow you to have the dimensions only grow slowly over time. The choice of dimensions is completely practical and based on the data behavior.

I recommend you have a look at Kimball's books.

For a database of a few GB, it's certainly possible to update a reporting database from scratch several times a day (no history, just repopulating from a 3NF for a different model of the same data). There are certain realtime data warehousing techniques which just apply changes continuously throughout the day.

So while DW projects might not be trivial, the denormalization techniques are very approachable and usable without necessarily building a complete time-invariant data warehouse.

Cade Roux