Can a Superkey include things that aren't part of the Primary key?

A superkey can contain non-unique identifiersor Primary Key. But like a compound primary key, the combination must be unique. A normalized dataset has tables that hold data about a particular entity relevant to the purpose of the database. So, e.g., a company with an employee database has a table of employees and might have another table of employee action history (promotions, reviews, salary adjustments, etc.)

A key is a candidate key if it qualifies to uniquely select a specific row in the table such that the data in that table depend completely on that key and nothing else is required.

Suppose the employee table is USA based. It might have two candidates - one might be an employee number, the other the employee's social security number. If a company requires employees to have SSNs..., this could happen.

OK, two candidates - employee number and SSN.

You must pick one to be the primary key, in which case the other becomes just another data field dependent on the choice of key.

In the USA, based on two factors, employee number would probably be picked as the PK leaving SSN as a secondary field. First, there is a legal restriction on use of an SSN because it is personally-identifying information that promotes identity theft. Second, it is possible that a person might have applied for an SSN but not yet gotten it. So SSN can't be relied upon, thus invalidating it as a candidate.

Therefore, the difference between primary keys and candidate keys is that the other candidates lost the race.

A SUPERKEY is essentially an overdetermined key, one that is guaranteed to be a unique selector - but that isn't the smallest it could be. In my employee table example, the combination of (employee number, SSN) would be a superkey. Note that the superkey will often include the primary key.

There is more to it than that. The definition speaks of a superkey that has the same cardinality as that of the whole record as a key. (The ultimate key, totally impractical since you have to know the whole record beforehand to use it as a key...)

"Cardinality" simply says, when I query a table on a single key value, what is the average number of returned records. For a proper prime key, this is always and only 1 for all values that exist in the table at all. (Obviously, it is zero for numbers that don't exist in the table.) Cardinality will be greater than one for non-unique keys. The usual example for the other end of the spectrum is for a table of size N, using "Gender" as a key will give you a cardinality of N/2. Remember, since it is the AVERAGE return size, the table's distribution doesn't have to be perfect.

Hope that helps.

`A` is the only reason results are consistent, and there's no additional benefit to including `B` and/or `C` (or a combination of) because `A` is the primary key. Hence, `A` is the only "superkey".

OMG Ponies 2010-02-01 21:51:33

What you are describing is a *candidate key*. `A` is the only candidate key. `ABC`, `AB`, `BC` are superkeys but not candidate keys.

finnw 2010-02-02 06:40:38

ansaurus

tags:

views:

answers:

Can a Superkey include things that aren't part of the Primary key?

related questions