views:

167

answers:

5

I've lately seen a database where there was a table Types with columns Id, Key and Name.

Id was just an Id of the type, Key was a short key name for the type, for example "beer", and the Name was text that could be displayed for the user (for example, "Our greatest beers"). Id was of course unique and was a primary key for this table. Key was also unique. Other tables were always linked with table Types using its Id column, but stored procedures were always using Key for filtering (e.g. "X inner join Types on X.type_Id = Types.Id where Types.Key = 'beer' " instead of "X.type_Id = 3").

I thought about it as a bad approach. I would use Id rather than Key, even if i knew Key is unique. I think it would be possible (and ok) that the Key changes but the Id shouldn't change as it is used in another tables for linking. Are there any rules for not doing this? I mean if we changed Key "beer" to "beers", some of the stored procedures would stop working properly (and there actually was such a situation). For me it was quite intuitive that if the Id identifies the row in a table, we should always use the id, as the other attributes may change if needed and it should not cause problems. Am I right?

+1  A: 

There's only one (single field or multi field) primary key and it must be used always when doing JOINS. Anyway, in the specific domain (about which you are not telling us) it may make some sense to search for another field in specific queries. If, as you say, those fields can change, it's bad practice to hard code this queries.

tekBlues
+1! tekBlues identifies what's "bad" about the approach, it's hard coding values into the stored procedure, whether it's id=3 or key='beer', when those values can change. (the primary key should be immutable, that is, once assigned, it should not change.)
spencer7593
+5  A: 

Key is a more meaningful and understandable way to access the data in the tables. Let me put it this way: would you rather debug this

SELECT ColumnA, ColumnB
FROM Table T
INNER JOIN Keys K
ON T.KeyId = K.KeyId
WHERE K.Key = 'Beer'

Or

SELECT ColumnA, ColumnB
FROM Table T
WHERE T.KeyId = 103461

When you have no idea what '103461' represents?

The same goes for stored procedures and other parameterized queries. Would you rather see

EXEC get_items_by_category 'Beer'

Or

EXEC get_items_by_category 103461

? The answer should really be obvious. Good, maintainable code is self documenting, and arbitrary IDs can't give you that.

Welbog
comments are very useful in sql... "T.KeyID = 103461 --Beer"
dotjoe
but when we changed some key because there was a spelling mistake all stored procedures stopped working! Do you think it would good to operate with people's last names (knowing they must be unique) i nstored procedures, just to make it more readable? what happens if you change person's last name because there was a spelling mistake? it is an analogous situation
agnieszka
But natural keys change over time and are very bad for maintaining data integrity. I would alomst never chose a natural key over a surrogate for joining, but then I've worked with hundreds of databases over 30 years and I know exactly what a mess natural keys can make.
HLGEM
I'm not joining on any natural keys, as you can see. From a maintenance perspective, using natural keys is more meaningful than using arbitrary keys. If you have a known unique constraint, and you know your natural keys aren't going to change often, use the natural keys for lookups. If your natural keys are fickle, then I would argue that they aren't keys at all.
Welbog
A: 

Sounds like a bad design. I don't think there's fundamentally evil about it, after all, you can have multiple unique indexes for a table.

I can see where the id was an IDENTITY surrogate, you could not count on a certain value existing, so the natural key is used instead, but like you said, there's no guarantee that it exists, so your logic depending on it could break.

However, in that case, there are other designs which might work, like an IsBeer table which contained all the ids of "Beers" or a Flags table which had an IsBeer column (and an IsFood column), or something like that. Again, you cannot depend on them existing at all, but you would at least not have to worry about a column change breaking the logic, since you would have a FK relation.

To take welbog's example further, wouldn't you rather include the information from your application logic structurally in the database like:

SELECT ColumnA, ColumnB
FROM Table T
INNER JOIN Keys K
    ON T.KeyId = K.KeyId
    WHERE K.IsBeer = 1
Cade Roux
A: 

I agree with you in general, and disagree with darasd. I assume the id columns are surrogate keys - in other words, they are internal to the database only and never presented to the user. I think this (use of surrogate keys as the absolute unique identifiers) is the preferred approach in most scenarios. It allows you as you said to be able to better handle situations when the real-world descriptive key (eg "bees") is supposed to be unique but then in reality turns out not to be. The classic example of this is SSN when it is intended as a uniqueid but turns out in reality not to be.

PaulR
+1  A: 

Think of ID values as if they're memory addresses holding some variable's value. Would you prefer to refer the value by the variables' name or the memory address?

Personally I would never depend on a specific ID if it's auto increment. Especially if a row might be deleted. What if you dump the data one day and import it again, perhaps because you want to nake a fresh install. The SQL server will re-enumerate the ID and all your queries are broken.

EDIT: (answering your comment)

So your main argument is that in your scenario an ID will never change and a key can change? I would say that's bad design :). If you have to live with it, it seems obvious to use the ID, even if it's nondescriptive. Imho, the key has no value in this scenario if changing it is allowed and breaks lots of queries.

VVS
ok but the question is what happens more often - clearing the database and inserting data again or spelling mistakes and data change ("beer" to "beers")?
agnieszka
Where are you getting the value "beer"? If you have a closed loop on the database, it shouldn't be a problem. Update "beer" to "beers" and the list of available keys to pass to the stored procedure will change.
Peter
you don' read carefully. There is "beer" hardcoded in stored procedure: "where Types.Key = 'beer'". So if the "beer" changes, the stored procedure stops working
agnieszka