views:

127

answers:

4

Suppose I have two tables in a parent - child relationship. Let's call them "sites" and "buildings" where a "site" is a parent of one or more "buildings". I want the IDs to be unique across databases so I will be using GUIDs for the IDs.

Should I prefer generic names for the ID fields in the tables or strongly typed names and why?

Example 1 (Generic Names):

CREATE TABLE sites (
   id VARCHAR(38) NOT NULL,
   -- other attributes
);

CREATE TABLE buildings (
   id VARCHAR(38) NOT NULL,
   parent_id VARCHAR(38) NOT NULL,
   -- other attributes
);

Example 2 (Strongly Typed Names):

CREATE TABLE sites (
   site_guid VARCHAR(38) NOT NULL,
   -- other attributes
);

CREATE TABLE buildings (
   building_guid VARCHAR(38) NOT NULL,
   site_guid VARCHAR(38) NOT NULL,
   -- other attributes
);
+1  A: 

One convention is to prefix all of the fields in the buildings table with "building_", so you'd have "building_id", "building_name", etc. Foreign keys, such as "site_id", would keep their original name, making it completely obvious what's going on.

edit

I should mention that I chose "_id" over "_uid" or "_guid" because there's only one ID field so I didn't see any reason to emphasize its data type.

Steven Sudit
The ANSI syntax example from Tony Andrews offers another compelling reason to use this convention: it's not only supported by query-generating tools, but also as a simple way to specify the join itself.
Steven Sudit
+4  A: 

I prefer simple id, as this is standard conventions, here is a great link for all database naming conventions: http://weblogs.asp.net/jamauss/pages/DatabaseNamingConventions.aspx#Columns

"Rule 2a (Identity Primary Key Fields) - For fields that are the primary key for a table and uniquely identify each record in the table, the name should simply be “Id“ since, that's what it is - an identification field. This name also maps more closely to a property name like “Id“ in your class libraries. Another benefit of this name is that for joins you will see something like "Customers JOIN Orders ON Customer.Id = Orders.CustomerId“ which allows you to avoid the word “Customer“ again after the Customer table."

Patcouch22
+1: exactly how I prefer it
Fredrik Mörk
Compare that with "Customers JOIN Orders on Customer.CustomerId = Orders.CustomerId". The original ID and its foreign key usage are identical, removing any ambiguity and allowing many query design tools to automatically figure out the join fields.
Steven Sudit
Oh, and I'd be careful with mixed-case in SQL, since it's not case-sensitive. Safer to use all-lower with underbars, even though it looks worse.
Steven Sudit
@Steven: why is lowercase safer than mixed case when it's not case sensitive?
Fredrik Mörk
@Frerik: Ambiguity. As a contrived example, consider that "FuguId" and "FuGuid" would match if compared without case sensitivity. But "fugu_id" and "fu_guid" are safe, always.
Steven Sudit
@Steven: I see your point; thanks for clarifying :o)
Fredrik Mörk
Thanks for the quick reply. What might a solution be if we wanted to complicate the matter a little by adding a human readable (e.g., integer ) id into the mix? This would not be used for maintaining relationships, but simply to provide something easier to reference verbally than a guid.
Ryan Taylor
Are we talking about a field that is never returned to the user but is instead used by a DBA for convenience when hand-editing the database?
Steven Sudit
The PK (GUID) is never returned to the user. The integer ID may actually be returned to the user in instances when other identifiable information is not available ( such as the building name ).
Ryan Taylor
@Ryan: I'm sure I don't understand your intent. There are times I've used a regular sequential identity field as an internal alternative to the GUID. This means that only the GUID is ever used for joins, returned to the end user, or specified to select a row. The integer field is used when I'm hand-editing the database and want to type in a quick UPDATE command to change a value, without having to deal with GUID's. That's how I've used it, anyhow. How exactly are you using it?
Steven Sudit
@Steven: We will never be exposing the GUID to the end user, but it will be used to maintain / define relationships. We find that the GUID is too difficult for users to discuss. The integer ID will be exposed to the user but not used for maintaining relationship. It will strictly be used so that we can refer to an end user exposed entitiy without referencing a 38 character string.
Ryan Taylor
@Ryan: Ok, I'll bite: Why use GUID's at all, then?
Steven Sudit
@Steven: We need to merge records from N databases into M databases, including partial merges. Essentially, database replication. We had problems using integer PKs as the PKs changed during the merge process into our central database. This broke the relationships between entities. Our goal is to create PKs and FKs that are merge friendly ( GUIDs ) and provide human friendly IDs for referring to specific entities.
Ryan Taylor
+1  A: 

It's mainly a matter of taste. There should be some consistency over the project. Most important are:

  • that you can remember the names of repetitive columns that you don't need to look into table definitions to type simple sql queries
  • that you don't need to look into code or documentation to roughly understand the meaning of a column

Personally I prefer ids with the same name all over the project. It's also easier to write generic code, that is independent of a specific table.

For foreign keys I usually use the <ForeignTable>_FK pattern. If there is more then one foreign key to the same table, I use the role name, as usual in object oriented design, <Role>_FK, for instance CurrentUser_FK.

Stefan Steinegger
+2  A: 

I prefer to use building_id, site_id so that the column name defines its contents more explicitly than just "id". This also makes it possible to use the ANSI join "using" syntax:

select site.site_id, building.building_id
from building
join site using (site_id);

Another advantage is that when such columns are used in queries (or views or subqueries), they don't need re-aliasing so often - like this:

select site.id as site_id, building.id as building_id
from building
join site on site.id = building.site_id;
Tony Andrews
Interesting. I wasn't aware of this bit of ANSI syntax. Thanks.
Steven Sudit
@Tony: I'm trying to figure out what currently supports this syntax. Do you have any references you could direct me to that define it and maybe give it a name I can then search for?
Steven Sudit
See http://en.wikipedia.org/wiki/Join_(SQL) which says "The USING clause is supported by MySQL, Oracle, PostgreSQL and SQLite."
Tony Andrews
Thanks. Conveniently, it's not supported by the database I actually use. :-)
Steven Sudit
Ultimately, this is the direction we chose mainly because of the self-documentation that this provides. Our DB admin adds that clients do not always read the documentation ( ERD ) and so this sort of self-documentation can be very helpful in reducing long term support and maintenance.
Ryan Taylor