views:

26

answers:

3

I have a legacy table that has as a part of its natural key a column named <table_name>_IDENTIFIER and it seems like it would be confusing to create a surrogate key named <table_name>_ID or ID so I'm leaning towards naming it SURROGATE_KEY. All my other tables use the <table_name>_ID syntax. Any better suggestions?

+4  A: 

Don't call it SURROGATE_KEY. That is meaningless in any other context. I'd stick with <table_name>_ID. Yes it's a little confusing. But, given your established convention, anything else would be confusing too.

Marcelo Cantos
How is "<table_name>_ID" any more meaningful than "surrogate_key"? At least the latter describes what it is.
Thomas
It tells you what table it belongs to. Since this key is likely to appear as a foreign key elsewhere, it's better to call it the same thing at both ends (allowing the use of SQL's `USING` keyword) than to have `SURROGATE_KEY` in one table and `<table_name>_SURROGATE_KEY` in the other.
Marcelo Cantos
If you have a long query joining several tables by their various surrogate keys, it is a LOT easier to understand the code if the names of the joining columns indicate something about the nature of the data they contain (i.e. this is an Id for table MyTable) than if they all had the exact same name (i.e. ID or SURROGATE_KEY). Yes, you can tell where they come from by the table name or alias (MYTABLE.SURROGATE_KEY, or mt.ID), but why not make it easier on yourself (and those poor slobs debugging your code later on, who might be me) and include extra information?
Philip Kelley
@Philip Kelley - Why not prefix all your columns with table names? Just as MT.ID is hard to debug, so is MT.Name. If you think that prefixing columns is a bad idea, then it is also a bad idea on an ID column.
Thomas
@Marcelo Cantos - You cannot use the column (any column) without knowing the table to which it belongs! You must have the entity. So, `TABLENAME.TABLENAME_SURROGATE_KEY` is just redundant.
Thomas
@Thomas - Ultimately, it's a style issue. There is no right or wrong answer.
Philip Kelley
@Marcelo Cantos: I'm still really torn between the '<table_name>_ID' and 'SURROGATE_KEY' options, but decided to go with your suggestion for consistency's sake. I'll put comments on the columns (does anybody except me even consult those?) to disambiguate them.
RenderIn
@Philip Kelley - Agreed. At the end of the day, the biggest issue is consistency. Either with prefixes or not, do it consistently.
Thomas
I hate id columns named ID. When you have foreign keys, it is useful for the FK and the PK to have the same name. That's why people use tablename_ID: not for the original table, but to make it easier to find in the FK table.
HLGEM
Philip: I strongly favour the use of `USING` on DBMSs that support it (which seems to be just about everyone except SQL Server). It makes multiway joins a whole lot cleaner, but is only possible if the join columns have the same name in both tables. I even have a soft spot for `NATURAL JOIN`, which, again, is only usable if the join columns have matching names.
Marcelo Cantos
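
To make the `USING` / `NATURAL JOIN` point above concrete, here is a minimal sketch using SQLite through Python's `sqlite3` module. The `customer` and `orders` tables and all of their columns are hypothetical names chosen for illustration; they are not from the original question. It assumes the <table_name>_ID convention, so the key carries the same name at both ends of the relationship.

```python
import sqlite3

# Hypothetical schema for illustration only; none of these names come from the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id         INTEGER PRIMARY KEY,  -- surrogate key, <table_name>_ID style
        customer_identifier TEXT NOT NULL,        -- legacy natural-key column
        name                TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
        total       REAL NOT NULL
    );
    INSERT INTO customer VALUES (1, 'ACME-001', 'Acme');
    INSERT INTO orders VALUES (10, 1, 25.0);
""")

# USING works because the key has the same name in both tables.
using_rows = conn.execute(
    "SELECT name, total FROM orders JOIN customer USING (customer_id)"
).fetchall()

# NATURAL JOIN matches on every column name the two tables share
# (here only customer_id), so it gives the same rows.
natural_rows = conn.execute(
    "SELECT name, total FROM orders NATURAL JOIN customer"
).fetchall()

print(using_rows, natural_rows)
```

Note that `NATURAL JOIN` silently changes meaning if the tables later gain another column with a shared name, which is one reason many people prefer the explicit `USING` form.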
+2  A: 

I might suggest that you go with your standard: <table_name>_ID

Eventually, the legacy table will not be the driving force, and it will be the IDENTIFIER column that looks odd. That is what you want, as opposed to the 'oh yeah, I need to use surrogate_key for that thing instead of id...' moment.

Randy
A: 

First, I would not include the table name in my columns. A column is an attribute which requires the context of the entity to which it belongs. A "name", for example, is of no use without that context. You need to know whether it is a Person's name or a Company's name, etc., and you have that in the name of the entity itself. Thus, I would not prefix columns with the name of the table in which they are declared.

That leaves you with choices like "Id", "Key", "SurrogateKey", or perhaps "SystemId", which are all equally vague. At least "SurrogateKey" describes what it is, which is a bonus. That name will make sense to a DBA but perhaps not to a developer (although they should understand the concept). Of those choices, I'd be inclined to use "Id" and find a way to change <table_name>_Identifier to something more descriptive.

Thomas
I agree with this, but I make an exception for surrogate keys. In general, these are only used to "uniqueify" tables and for joins and foreign key constraints, and it makes complex code more legible if you can *quickly* tell (from your ON clauses) where the joining columns are "coming from".
Philip Kelley
@Philip Kelley - That is what table names are for. TABLE.TABLE_ID is redundant. Use good table aliases, and this isn't a problem. IMO, it is easier to see which side is the PK and which side is the FK if the PK is simply `Id`.
Thomas
@Thomas - Ultimately, it's a style issue. There is no right or wrong answer.
Philip Kelley
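
For comparison, the alias-based style argued for in the comments above (the PK is plain `Id`, and the table alias supplies the context) might look like the sketch below. Again, this uses SQLite via Python's `sqlite3`, and the `customer` / `orders` tables are hypothetical names, not taken from the question.

```python
import sqlite3

# Hypothetical schema for illustration only: every primary key is simply "id".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer (id),
        total       REAL NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Acme');
    INSERT INTO orders VALUES (10, 1, 25.0);
""")

# The ON clause shows which side is the PK (c.id) and which is the FK (o.customer_id).
# USING and NATURAL JOIN are not an option here: the key names differ,
# and "id" exists in both tables with unrelated meanings.
alias_rows = conn.execute("""
    SELECT c.name, o.total
    FROM orders AS o
    JOIN customer AS c ON c.id = o.customer_id
""").fetchall()

print(alias_rows)
```

Both conventions produce the same result; as the thread concludes, what matters most is applying one of them consistently.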