I'm looking to introduce some database denormalization to a heavily normalized system.

I have a large set of databases that have evolved over ten years and are under an increasing amount of load, so I'm looking to improve performance and possibly reduce the complexity of some queries.

It is not uncommon to do 10 joins to accomplish any given task in a stored procedure. I've been told that more than 6 stinks.

Should I keep the table structure as is and provide some materialized views or denormalized "cache" tables?

Some advice on best-practices or a push in the right direction would help.

Thanks

+10  A: 

You don't say what the problem is. Is it performance? If so, on what tables?

Is it really the joins that are causing the problem? Or is it the stored procedures? You don't know (or at least, you don't say).

Best practice: figure out where your bottlenecks are first, before trying to solve a problem you haven't yet diagnosed.
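(A minimal sketch of that first step, assuming the back end is SQL Server, which the question doesn't actually state: rank the cached statements by total elapsed time and see whether it's the joins, the procs, or something else entirely that is eating the time.)

    -- Assumes SQL Server 2005+; DMV-based look at where time actually goes.
    SELECT TOP 20
           qs.execution_count,
           qs.total_elapsed_time / qs.execution_count AS avg_elapsed_time,
           SUBSTRING(st.text, 1, 200)                 AS statement_start
    FROM   sys.dm_exec_query_stats qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle)   AS st
    ORDER  BY qs.total_elapsed_time DESC;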


On edit: I'm reminded of a time on a job when we had a performance problem: very slow stored procs that could take minutes to complete. It turned out that these sps were doing utterly normal table updates, but using cursors, for stuff as simple as update t set c = c + 1 where id = n.

So to do one update, we were cursoring through a bunch of rows with a costly update cursor: a declare cursor for "select c from t where id = n" for update; then an open cursor, a fetch, an error check, a loop with another fetch and error check, then select c into @c; set @c = @c + 1; update t set c = @c where current of cursor; for each and every update.

Turned out the guy who wrote this didn't realize that we could just issue an update statement. And so he'd written dozens of these slow stored procs. We didn't even need to get rid of the stored procs (though that would have gained us some speed too, it would have meant changing our client); we just got rid of the cursors, replacing them with update statements. Performance problem gone.
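Roughly, the before and after looked like this (same made-up t, c, and id names as above; this is a sketch of the pattern, not the actual code):

    -- The slow pattern: cursor through rows just to bump a counter.
    DECLARE @n int;  SET @n = 1;
    DECLARE @c int;

    DECLARE cur CURSOR FOR
        SELECT c FROM t WHERE id = @n
        FOR UPDATE;
    OPEN cur;
    FETCH NEXT FROM cur INTO @c;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        UPDATE t SET c = @c + 1 WHERE CURRENT OF cur;
        FETCH NEXT FROM cur INTO @c;
    END
    CLOSE cur;
    DEALLOCATE cur;

    -- The replacement: one set-based statement does the same work.
    UPDATE t SET c = c + 1 WHERE id = @n;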

tpdi
+1  A: 

I have to agree, 10 joins is evil and will kill your performance.

Much depends on how your applications interact with the databases. If your code sticks strictly to stored procedures (no direct SQL calls) then your life will be easier.

If you are going to the trouble of denormalization, I would not add new "cache" tables. This is really just patching the problem. I would go ahead and formulate a complete plan to denormalize the database with a new optimized schema.
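For what it's worth, schema-level denormalization might look something like this (the table and column names are invented for illustration, not from your system): fold the frequently joined columns into the table that reads them, and make the writes responsible for keeping the copies correct.

    -- Hypothetical illustration of schema-level denormalization:
    -- the read used to need a join ...
    --   SELECT o.OrderId, o.Total, c.Name, c.Region
    --   FROM   Orders o JOIN Customers c ON c.CustomerId = o.CustomerId;
    -- ... so copy the hot columns onto Orders and keep them in sync on write.
    ALTER TABLE Orders ADD CustomerName   nvarchar(100) NULL,
                           CustomerRegion nvarchar(50)  NULL;

    UPDATE o
    SET    o.CustomerName   = c.Name,
           o.CustomerRegion = c.Region
    FROM   Orders o
    JOIN   Customers c ON c.CustomerId = o.CustomerId;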

Dave Swersky
+2  A: 

Some things to investigate

  • Benchmark all query run times - give yourself a metric to compare against.
  • Verify that indexing is done properly.
  • Read up on table partitioning.
  • Explore sharding as an option.
  • Look at your joins closely. Are you always joining the same tables together? If the answer isn't so obvious, you still can create logical divisions (like your denormalized tables) using views; a sketch follows this list.
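As a rough example of that last bullet (the table names are hypothetical, not from your schema), a view can wrap a commonly repeated join so callers query one flat shape without the underlying tables changing:

    -- Hypothetical view encapsulating a join that queries keep repeating:
    CREATE VIEW dbo.OrderSummary AS
    SELECT o.OrderId,
           o.OrderDate,
           c.CustomerName,
           p.ProductName,
           li.Quantity
    FROM   dbo.Orders     o
    JOIN   dbo.Customers  c  ON c.CustomerId = o.CustomerId
    JOIN   dbo.OrderLines li ON li.OrderId   = o.OrderId
    JOIN   dbo.Products   p  ON p.ProductId  = li.ProductId;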
Elijah
+1  A: 

Try to index heavily and wisely. Try indexed views. Try precompiled stored procedures. If all of this fails, remember that denormalizing and caching often require heavy synchronization work, so look carefully at each case before doing it.
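If the back end is SQL Server (an assumption; the object names below are made up), an indexed view is roughly how you'd get a materialized result: the view must be schema-bound, and the unique clustered index is what actually persists the rows.

    -- Hypothetical indexed view (SQL Server): precompute an aggregate once,
    -- then let readers hit the materialized result instead of the base join.
    CREATE VIEW dbo.SalesByProduct
    WITH SCHEMABINDING
    AS
    SELECT li.ProductId,
           SUM(li.Quantity) AS TotalQuantity,
           COUNT_BIG(*)     AS RowCnt   -- required when the view has GROUP BY
    FROM   dbo.OrderLines li
    GROUP  BY li.ProductId;
    GO

    CREATE UNIQUE CLUSTERED INDEX IX_SalesByProduct
        ON dbo.SalesByProduct (ProductId);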

pomarc
A: 

I would agree with Elijah. Be sure you benchmark anything you are about to change.

Also, depending on your setup, indexed views may be an option.

William Edmondson