views:

165

answers:

5

In a database prototype, I have a set of fields (like name, description, status) that are required in multiple, functionally different tables.

These fields always have the same end user functionality for labeling, display, search, filtering etc. They are not part of a foreign key constraint. How should this be modeled?

I can think of the following variants:

  • Each table gets all these attributes. In this case, how would you name them? The same, in each table, or with a table name prefix (like usrName, prodName)

  • Move them into a table Attributes, add a foreign key to the "core" tables, referencing Attributes.PK

  • As above, but instead of a foreign key, use the Attributes.PK as PK in the respective core table as well.

+1  A: 

Normalisation is often best practice in any relational database (within reason).

If you have fields like state (meaning the state within a country), then a reference table like "State" with (id, short_name, long_name etc...) might be the way to go, then each record that references a state only need a state_id column which, as you did mention, is a reference to a record in the State table.

However, in some instances normalisation of all data is not necessarily required as it just complicates things, but it should be obvious where to do it and where not to do it.

Hope this helps.

Mark
"state" meant status (e.g. active/deleted/expired, to track history) - sorry for the confusion
peterchen
I see, well I think the same rules apply, if many tables will be using the same data, then a reference table is still the way to go. It also avoids spelling mistakes as references are ints to other tables, so the only place that a mistake could be is a single row in the reference table.
Mark
+3  A: 

Unless you use the same name or description values across tables, you shouldn't normalize that data. Status types tend to be reused, so, normalize those. For example:

order_status_types
- id
- name
- description

shipping_accounts
- id
- name
- description

orders
- order_status_type_id
- shipping_account_id

preferences
- shipping_account_id
Terry Lorber
+6  A: 

it sounds like you might be taking the idea of normalization a bit too far. remember, it's the idea that you're reducing redundancy in your data. your example seems to indicate you're worried about "redundancy" in the meta information of your database design.

ultimately though, user.name and user.description are functionality different from product.name and product.description, and should be treated as such. for status, it depends what you mean by that. is status just an indicator of a product/user's record being active or not? if so, then it could make sense to split that to a different table.

using the info you provided, if "active/expired/deleted" is merely an indication of state within the database, then i'd definitely agree with a table structure like so:

users            products         status
  id               id               id
  name             name             name
  description      description
  status_id        status_id

however, if status could conceivably be altered to represent something semantically different (ie, for users, perhaps "active/retired/fired", i'd suggest splitting that up to future proof the design:

user_status     product_status
  id              id
  name            name

in short, normalize your data, not your database design.

Owen
Not worried, just wondering - thanks folks :)
peterchen
+1  A: 

I would give each table its own set of columns, even if they have the same names and are logically similar.

If you ever need to change one of the tables by adding or deleting some of these columns, or changing their data type, then you can do it only in the table where it pertains, instead of figuring out how to complicate your shared-attribute table.

Giving each table control of its own attributes promotes Cohesion, which is a good thing. It also avoids your question about which direction the foreign keys go.

As for column naming, it's not necessary or advisable to put prefixes on column names. If you ever do a join that results in columns of the same name coming from two tables, use aliases to distinguish them.

Bill Karwin
+1  A: 

I've always given each table a 3 letter code which I then use in all field names. That way in the product table I have prdname, prddescription, prdstatus, and in the vendor file I have venname, vendescription, venstatus. When things get joined, there is no need to worry about same named fields.

Of course, the tables all have a field named plain old id and the product table would have a field named venid that refers to the id field in the vendor table. In this case I don't put the prd prefix on it because venid makes perfect sense and is nonambiguous.

Michael Dillon