views:

50

answers:

3

Hello,

I have a database with some information which are repeated in some tables.

I want to know if it's interesting to create a table with this information and in the other table, I put only the id.

It's interesting because with this method I haven't got redundance. But I will have to do many joints between my tables in my request, and I'm afraid my request will be more slow.

(I work with symfony if it changes something)

+1  A: 

It sounds like the 'information' in question is data that makes up key values. If so, it sounds like the database designer likes to use natural keys and that you prefer to use surrogate keys.

First, these are both merely a question of style. If the natural key values are composite (i.e. involve more than one column) and are included in other columns for data integrity purposes then they are not redundant.

Second, as you have observed, when it comes to performance of surrogate keys you have to weigh the advantage of the more efficient data type (e.g. a single integer column) against the degrading performance of needing to write more JOINs. Note that using surrogates tends to make constraints more troublesome to write e.g. when the required values for a rule is in another table and you SQL product doesn't support subqueries in CHECK constraints then you will need to use a trigger which degrades performance in a high activity environment.

Further consider that performance is not the only consideration e.g. using natural key values will tend to make the data more readable and therefore make the schema easier to maintain because the physical model will reflect the logical model more closely (surrogate keys do not appear in the logical model at all).

onedaywhen
A: 

You're talking about Normalisation. As with so many design aspects it's a trade-off.

Having duplication within the database leads to many problems - for example how to keep those duplicates in step when updating data. So Inserts and Updates may well go more slowly because of the duplication. Hence we tend to normalise the database to avoid such duplication. That does lead to more complex queries and possibly some retrieval overhead.

Modern database products tend to do such queries really well if you take a bit of care to have the right indexes in place.

Hence my starting position would be to normalise your data, avoid duplication. Then in a special case perhaps denormalise just pieces where it really becomes essential. For example suppose some part of you database is large, mostly queried rather than updated (eg. historic order information) then perhaps denormalise that bit.

djna
A: 

If you don't normalise

  • How are you going to store values that could potentially be used?
  • How are you going to separate "Lookup value" from "Look up value from "LookUpValue" etc
  • You'll be slows because you are storing variable length string "Lookup value" across many rows, rather than a nice tidy integer key

This is the more practical points to the other 2 answers...

gbn