views: 279
answers: 5

Before denormalizing, I'm wondering what effect this is going to have on the following:

  • Query response time
  • Width of rows in the database
  • Joins necessary for a result
  • Number of queries necessary for requests to complete

It seems, if I am not mistaken, that all of these will be reduced?

+2  A: 

Your assumptions are correct. Denormalizing will improve read performance, but the downside is that it puts correctness at risk: the same fact is stored in more than one place, and the copies can disagree.

This topic has been discussed at length in this previous Stack Overflow question.
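As a hypothetical sketch of that risk (the tables and names below are made up for illustration; they are not from the question):

    -- A denormalized design: orders carries a redundant copy of customers.name.
    CREATE TABLE customers (
        customer_id INT PRIMARY KEY,
        name        VARCHAR(100) NOT NULL
    );

    CREATE TABLE orders (
        order_id      INT PRIMARY KEY,
        customer_id   INT NOT NULL REFERENCES customers (customer_id),
        customer_name VARCHAR(100) NOT NULL  -- denormalized copy of customers.name
    );

    -- Reads need no join, but an update that touches only one copy leaves
    -- the database contradicting itself:
    UPDATE customers SET name = 'Acme Corp.' WHERE customer_id = 42;
    -- orders.customer_name for customer 42 still holds the old name.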

ichiban
+1  A: 

Would not the width of the rows in the database increase?

Denormalizing should only be done as an optimization of last resort.

It will increase the size of the database, increase data duplication, and make it harder to keep the data up to date and in sync.
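Reusing the hypothetical orders/customers shape sketched in the answer above, the sync burden looks like this: every change to the master row has to fan out to all of its copies, ideally in a single transaction:

    -- Renaming one customer now takes two statements instead of one,
    -- and forgetting the second one corrupts the data silently.
    BEGIN TRANSACTION;
    UPDATE customers SET name = 'Acme Corp.'          WHERE customer_id = 42;
    UPDATE orders    SET customer_name = 'Acme Corp.' WHERE customer_id = 42;
    COMMIT;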

Jeffrey Hines
Which is why it's well suited to a data warehouse (where you don't keep anything up-to-date), and badly suited for an on-line transactional database. Different uses do better with different designs.
David Thornley
The width of the rows will increase? Sorry. Could you explain?
Gunther Krauss
@Gunther If you denormalize, you add extra columns to your table to hold the denormalized data, so your rows have more columns and are therefore wider. This also means that fewer rows fit in a page, and reading the same number of rows takes more I/O and memory.
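A hypothetical sketch of that widening (made-up tables, for illustration only):

    -- Normalized: the order row carries only a small key.
    CREATE TABLE orders_narrow (
        order_id    INT PRIMARY KEY,
        customer_id INT NOT NULL
    );

    -- Denormalized: customer attributes are copied into every order, so
    -- each row is wider and fewer of them fit in a page.
    CREATE TABLE orders_wide (
        order_id         INT PRIMARY KEY,
        customer_id      INT NOT NULL,
        customer_name    VARCHAR(100) NOT NULL,
        customer_city    VARCHAR(100) NOT NULL,
        customer_country VARCHAR(100) NOT NULL
    );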
Jeffrey Hines
A: 

Denormalization is a fairly broad term, so there's no quick answer to your question.

Where you're avoiding a join, retrieval is likely to be faster. However, you're adding to the complexity of updates and data maintenance, so there's a tradeoff.

You seem to be asking whether the number of queries or the overall speed will improve system-wide, which isn't quite the right frame. It's better to think of denormalization as a local optimization for a specific bottleneck in your application, i.e. something you consider in order to make a single query, or a small set of queries, run faster.
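To make the join-avoidance case concrete, here is a hypothetical before/after (the table and column names are made up):

    -- Normalized shape: the lookup needs a join.
    SELECT o.order_id, c.name
    FROM   orders o
    JOIN   customers c ON c.customer_id = o.customer_id
    WHERE  o.order_id = 1001;

    -- Denormalized shape: the same lookup reads one wider table, no join.
    SELECT order_id, customer_name
    FROM   orders
    WHERE  order_id = 1001;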

Steve B.
+1  A: 

If you happen to work in Microsoft SQL Server, I highly recommend keeping your tables normalized and using so-called indexed views for denormalization. These are persistent structures that SQL Server maintains automatically whenever the underlying tables change. This way you keep the best of both worlds: a normalized schema AND fast denormalized data!
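A sketch of what that looks like (dbo.orders and its columns are hypothetical; SQL Server requires SCHEMABINDING, two-part table names, and COUNT_BIG(*) in an aggregated indexed view):

    -- A schema-bound view over the normalized table. order_total is
    -- assumed NOT NULL, which SUM in an indexed view requires.
    CREATE VIEW dbo.customer_order_totals
    WITH SCHEMABINDING
    AS
    SELECT customer_id,
           COUNT_BIG(*)     AS order_count,
           SUM(order_total) AS total_spent
    FROM   dbo.orders
    GROUP BY customer_id;
    GO

    -- The unique clustered index is what materializes the view: SQL Server
    -- stores the aggregated rows and keeps them in sync on every write.
    CREATE UNIQUE CLUSTERED INDEX ix_customer_order_totals
        ON dbo.customer_order_totals (customer_id);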

Something similar may also exist for Oracle, not sure.

zvolkov
Materialized views? I'm not completely sure what Indexed Views are, but they sound similar.
David Thornley
A: 

Apart from "width of rows in the database", your answers are all correct.

"Denormalization" means stashing more information in a row than is strictly necessary. That is impossible without increasing row width.

But the most important thing is this: you did not ask all the questions.

You should also wonder: if you deliberately introduce redundancy into your database, shouldn't you introduce some extra database constraints at the same time, to prevent the database from containing corrupt data (data that contradicts itself)?

And if the answer to that question is 'yes', then perhaps you should also ask yourself whether enforcing all those additional constraints, which are only needed because you introduced the redundancy, will not cost you a similar (or much graver) loss in update performance.
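One hypothetical way to make that concrete, reusing the orders/customers shape from earlier in the thread: the redundant pair can be guarded with declarative constraints, and the guard itself is where the extra update cost comes from:

    -- Make the (customer_id, name) pair referenceable...
    ALTER TABLE customers
        ADD CONSTRAINT uq_customer_id_name UNIQUE (customer_id, name);

    -- ...then have the denormalized table reference the pair instead of
    -- the id alone, so a stale customer_name can never be stored.
    ALTER TABLE orders
        ADD CONSTRAINT fk_orders_customer_pair
        FOREIGN KEY (customer_id, customer_name)
        REFERENCES customers (customer_id, name);

    -- Every insert/update on orders now re-checks the pair, and renaming a
    -- customer is rejected while orders still holds the old pair (unless
    -- you add ON UPDATE CASCADE): precisely the update cost described above.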