views:

75

answers:

5

Hi All,

I am currently trying to design a DB for a new project I am working on. My question stems around Normalizing and Denormalizing....what I am trying to figure out is should you always normalize to the fullest extent? The data will be information that the users will be revising very very rarely and will most often be interacting with the data view reports / views.

So my first attempts were normalizing the data, but then when I got deeper into the project and realized that they would very rarely be adding / changing data I thought it might be better to just denormalize the structure from the start.

Any words of wisdom.....there could be a chance that in the future they will interact with the data more than today, but still it will not be like an order system or CRM system.

Always appreciate the input.

--S

+4  A: 

The usual advice for designing OLTP databases is normalise to 3NF.

Whereas, best practice for a DataWarehouse or OLAP database is to denormalise into a Fact Table and supporting Dimension tables (a star schema).

So it depends on your precise usage scanario. From your description it sounds like you have something closer to OLAP.

I would normalise everything initially and then if there are performance problems look at denormalising.

SQL Server indexed views could possibly help you if your reports involve large numbers of aggregations, and you should create indexes to try cover your query workload.

Mitch Wheat
+1 all great points. I think your last two, "normalize everything initially and then look at de-normalizing if necessary" and "views for reports" will be very helpful to him.
JohnB
Thanks Mitch...very helpful. My gut was telling me I should go the normalization route, but it seemed so enticing to denormalize...really appreciate the info.
scarpacci
Indexed views: I think I need to look into that myself. I am currently working on a star schema with 20 million rows and I need to improve query times. Awesome links! (+1 again if I could)
JohnB
+2  A: 

Benefits for normalizing:

  1. save disk space
  2. less redundant data means more likely to have consistent data
  3. easier to implement changes to your schema and/or data when necessary

Benefits of de-normalizing

  1. easier SQL queries (don't need a ton of joins)
  2. joins are expensive (slow), and with a normalized db, you really should have indexes on all the join fields (not that indexes are a ton of work)
JohnB
Good points. +1.
David Stratton
+3  A: 

Normalization and proper design will save you problems down the road. Even if this particular project doesn't seem to need it, do it anyway, to be sure you're in the habit, and getting practice,

Most of the apps I've had to re-write were having issues due to a poorly designed DB.

So, as it's been said many times here, normalize as much as possible, unless it really hurts performance, and even then, only if there is a compelling reason to de-normalize.

David Stratton
+1: "do it anyway, to be sure you're in the habit, and getting practice" (I feel like I learned a lot about databases by following that advice)
JohnB
Thanks David...very good point
scarpacci
A: 

I always design to 3NF (and urge others to). If you strike specific performance problems, you can then (and only then) figure out the merits of reverting to non-normalised data. You don't revert just for the sake of reverting, since that introduces its own set of problems.

The point of having normalised data is to avoid inconsistencies in the data. You can revert for performance but you then usually have to introduce "tricks" (like triggers) to ensure data consistency (the C is ACID).

paxdiablo
A: 

Fifth Normal Form (5NF) is the basis of most good database design and that's what you should aim for except when you find exceptional reasons not too. The only potential advantage of 3NF is that it preserves some dependencies which 5NF does not. In those (generally rare) cases you need to decide whether it is worth preserving the dependency or whether it's better to normalize anyway and enforce the dependency through additional constraints and business logic.

This advice is just as true of data warehouse databases. Denormalization is often used as a strategy for data marts, but a data warehouse should generally be in Normal Form.

dportas