views:

1067

answers:

4

I have some mysql tables that have auto incrementing id's that are primary keys, but I notice that I never actually use them... I used to think that every table must have a primary key so I guess that is why I created them before. Should I remove them all if I don't use them at all?

+8  A: 

Unless you are running into space problems I wouldn't remove them.

They are a life saver in case you by mistake (or oversight) populate the database with repeated/wrong data.

They also help to have related tables, where you reference the content on one table through the autogenerated id.

This is assuming you have indexes for the other columns you use to actually query the data (if you don't, then more reason to keep the autoincrement ids and use them!).

Vinko Vrsalovic
There has to be a flaw in the DB design if he isn't using those keys at all.
ChaosPandion
Unless he is consistently and correctly using natural keys. But yes, I thought the same when I read the question.
Vinko Vrsalovic
Generally these autogenerated id's are what you use to establish relationships between tables. That way anything else in the row can change and the relationship stays the same. I agree with Chaos.
Daren Schwenke
-"They are a life saver in case you by mistake (or oversight) populate the database with repeated/wrong data." That's true, didn't think of that.-"They also help to have related tables, where you reference the content on one table through the autogenerated id." What do you mean by that?
Roger
"There has to be a flaw in the DB design if he isn't using those keys at all."-Well for example, I have a table with ratings. I have an auto increment PK for the rating id. I never actually use this id... I will never joining any tables with this id, nor do my queries really need to use this id. Is something wrong?
Roger
Without knowing the context of this rating table I couldn't tell you. Typically I might have a rating ID product ID user ID and rating. To update the row you query the rating ID directly for a clustered Index update.
ChaosPandion
@Roger: If you don't know what a related table is, Chaos is probably right. Read about foreign keys: http://www.visualcase.com/kbase/database_basics_-_foreign_keys.htm http://en.wikipedia.org/wiki/Foreign_key
Vinko Vrsalovic
"To update the row you query the rating ID directly for a clustered Index update." I'm not quite familiar with clustered indexes nor am I advanced enough to understand that =P, but this is a myisam table, so I'm not sure if it supports that.
Roger
Its actually very simple.Update Rating Set Value = @Value Where RatingID = @RatingIDAlways using the primary key to query and update the table will give optimal performance.
ChaosPandion
Oh that's what you meant. Yes, I know what a related table is. :P The reason I would remove the PK's in the first place is of course noticing that they I do not use them in any situation. I don't join any tables with them. I don't query for the PK, it's not used in any where clause, group by clause, etc. I agree with that accidentally populating wrong data it would be helpful to have the PK's, but if I do need the expand later, can't I simply add the auto increment PK and it will automatically fill a new PK with incremented numbers.
Roger
@Chaos: Oh, people can rate, but they can't change their rating so I probably wouldn't need to update.
Roger
@Roger: You really mean you have a system where every table is unrelated to every other table? Are you sure? If they are related you should use the relations.
Vinko Vrsalovic
@Vinko: Sorry for the misunderstanding. No of course there are related tables, but let's say I have 30 tables. On these tables they all have autoincrement integers as the PK. 20 of them may require joins to other tables, and require PK's for various other things. The other 10 tables have the PK, but don't use it at all.
Roger
@Roger But if you have a rating tied to a user, then a user's rating would be "RatingID", and the Rating Table would have a "ID" and a "Rating" column. The ID and the RatingID would be related.
George Stocker
@George You actually don't need to be a user to rate. But assuming you needed to be a user, why would you need the ID, don't really understand. :S
Roger
Before I was looking at some of the tables from famous scripts. Not all their tables contain a PK either, which was another reason that made wonder if it really is necessary for every table to have an autoincrement PK.
Roger
A: 

Interesting!...

I seem to hold a minority opinion here, getting both upvoted and downvoted to currently an even 0, yet no one in the majority opinion (see responses above) seems to make much of a case for keeping the id field, and the downvoters didn't even bother leaving comments hinting at why doing away with the id is such a bad idea.

In their defense, my own original response did not include any strong argument as to why it is ok to do away with the id attribute in some cases (which seem to apply to the OP). Maybe such a gratuitous response makes it, in of itself, a downvotable response.

Please do educate me, and the OP, by leaving comments pro or against the *systematic* (and I stress "systematic") need to include auto-incremented non-semantic primary keys in all tables. A promised I returned and added to my response to provide a list of reasons why it may be detrimental to [again, systematically] impose a auto-incremented PK.

My original response: You bet! you can remove these!

Before you do anything to the database make sure you have a backup, in particular is the DB size is significant.

Use the ALTER TABLE statement to remove the id in the tables where you want to remove it. Specifically
ALTER TABLE myTable DROP COLUMN id (you also need to remove the PK constraint before removing the id, if the table has such a constraint)

EDIT (Added later)
There are many cases where it just doesn't make sense to carry along an autoincremented ID key, regardless of the relative little extra storage requirement these keys add.
In all these cases, the underlying implication is that

  • either the data itself supplies a primary key,
  • or,   the application manages the key generation

The key supplied "natively" in the data doesn't necessarily neeeds to be a single column key, it can be a composite key, although in these cases one may wish to study the situation more closely, particularly is the overal key is a bit long.

Here are some of the drawbacks of using an auto-incremeted primary key in lieu of a native or application-supplied key:

  • The effective data integrity may go unchecked
    i.e. the server may allow record insertions of updates which create a duplicated [native] key (eventhough the artificial, autoincremented primary key hides this reality)
  • When relying on the auto-incremented PK for the support of joins between tables, when part of the [native] key values have to be updated...
    ...we either create the need of deleting the record in full and and re-insert it with the news values,
    ...or the risk of keeping outdated/incorrect links.
  • A common "follow-up" with auto-incremented keys is to create a clustered index on the table for this key. This does make sense for tables without an native or application-supplied primary key, so so much for data sets that have such keys. Effectively this prevents choosing a key for the clustered index which may be more beneficial for the most common query patterns.
  • Migrating tables with an auto-incremented key can made more difficult depending on the DBMS (need to declare the underlying column as plain integer, prior to copy, then need start again the autoincrement...)
  • For narrow tables, i.e. tables with a few columns only, the relative cost of the auto-incremented PK can be significant, and impact performance in a non negligible fashion.
  • When inserting new records along with associated records in related tables, the auto-incremented key needs to be obtained after the insertion of the main record, before the related records can be inserted; the logic is simpler when the column values supporting the link are known ahead of time.

To summarize, the idea that so long as the storage can carry the [relatively minimal] extra "weight" of the artificial primary key, we should include and use such a key, is not without drawbacks of its own.

A final consideration is that just like it is rather easy to remove such keys when we don't need them, they too can be easily added, post-facto, when/if it becomes apparent that they are useful in a particular situation. Neither form of refactoring (adding vs. removing the auto-incremented columns) is risk free, but neither is a major production either.

mjv
I agree with what you said and that if I do need the PK later I can just generate the column. One point for keeping the PK that stood out to me was, "They are a life saver in case you by mistake (or oversight) populate the database with repeated/wrong data."Although I think the chances of that are on the rare side for me. I would be more likely to accidentally drop the whole table instead, which I've done before. :p
Roger
@Roger: Even without the PK, there are plenty of ways of removing duplicated rows which get accidentally added (and/or recreating the table anew from its current state). Furthermore the argument about accidental rows, can cut the other way: with a semantically-driven PK constraint, duplicated rows do not happen in the first place ;-) I'm a frequent user/practitioner of auto-incremented PKs, but I wanted you to know that it is ok, indeed recommended in some cases to do away with these.
mjv
My argument was probably more strongly worded than it should as I was reacting to the one-sidedness, of the other responses, and the unfairness of downvotes for a response far from wrong...
mjv
Yea, no problem. I think that people actually misunderstood my question. I was only not using the PK for "some" of my tables as stated in my opening post. And of these tables I wanted to remove the PK's not for ALL tables of course.
Roger
+2  A: 

No.

You should keep them; a database always needs something that differentiates a row from another row (a "Key" of some sort).

If you have something that is guaranteed to be unique for each row, then you can use that as a key; otherwise keep the Primary Key and the Auto generated ID.

George Stocker
+2  A: 

I'd personally keep them. They will be especially useful at a later date if you expand the database design and need to reference this table.

Strawberry
Hmm, but if I decide later on I do need an auto increment PK, can't I just create one and it automatically populates my table with incrementing numbers.
Roger