views:

241

answers:

6

I have a 100 million rows, and it's getting too big. I see a lot of gaps. (since I delete, add, delete, add.)

I want to fill these gaps with auto-increment. If I do reset it..is there any harM?

If I do this, will it fill the gaps?:

mysql> ALTER TABLE tbl AUTO_INCREMENT = 1;
+6  A: 

Potentially very dangerous, because you can get a number again that is already in use.

What you propose is resetting the sequence to 1 again. It will just produce 1,2,3,4,5,6,7,.. and so on, regardless of these numbers being in a gap or not.

Update: According to Martin's answer, because of the dangers involved, MySQL will not even let you do that. It will reset the counter to at least the current value + 1.

Think again what real problem the existence of gaps causes. Usually it is only an aesthetic issue.

If the number gets too big, switch to a larger data type (bigint should be plenty).

Thilo
What if the number gets too big?
TIMEX
If the number gets too big, there are two cases (both bad): It is too big because of gaps, or too big even without gaps. In the second case, there is nothing you can do short of re-thinking your schema. In the first case, you could drop and recreate the whole column. But this will re-number every row, which may be out of the question as well.
Thilo
+5  A: 

Chances are you wouldn't gain anything from doing this, and you could easily screw up your application by overwriting rows, since you're going to reset the count for the IDs. (In other words, the next time you insert a row, it'll overwrite the row with ID 1, and then 2, etc.) What will you gain from filling the gaps? If the number gets too big, just change it to a larger number (such as BIGINT).


Edit: I stand corrected. It won't do anything at all, which supports my point that you should just change the type of the column to a larger integer type. The maximum possible value for a BIGINT is 2^64, which is over 18 quintillion. If you only have 100 million rows at the moment, that should be plenty for the foreseeable future.

musicfreak
No, you won't overwrite anything, Only primary keys can auto increment, so you will get a duplicate key error, if you could even do this.
Duncan
+2  A: 

I agree with musicfreak... The maximum for an integer (int(10)) is 4,294,967,295 (unsigned ofcoarse). If you need to go even higher, switching to BIGINT brings you up to 18,446,744,073,709,551,615.

Mike
Unless you're doing a LOT of deletes as well as insertions, I think you'd bring the database to it's knees before you needed to go larger than an unsigned int.
Duncan
+6  A: 

FWIW... According to the MySQL docs applying

ALTER TABLE tbl AUTO_INCREMENT = 1

where tbl contains existing data should have no effect:

To change the value of the AUTO_INCREMENT counter to be used for new rows, do this:

ALTER TABLE t2 AUTO_INCREMENT = value;

You cannot reset the counter to a value less than or equal to any that have already been used. For MyISAM, if the value is less than or equal to the maximum value currently in the AUTO_INCREMENT column, the value is reset to the current maximum plus one. For InnoDB, if the value is less than the current maximum value in the column, no error occurs and the current sequence value is not changed.

I ran a small test that confirmed this for a MyISAM table.

So the answers to you questions are: no harm, and no it won't fill the gaps. As other responders have said: a change of data type looks like the least painful choice.

martin clayton
+1  A: 

Since you can't change the next auto-increment value, you have other options. The datatype switch could be done, but it seems a little unsettling to me since you don't actually have that many rows. You'd have to make sure your code can handle IDs that large, which may or may not be tough for you.

Are you able to do much downtime? If you are, there are two options I can think of:

  1. Dump/reload the data. You can do this so it won't keep the ID numbers. For example you could use a SELECT ... INTO to copy the data, sans-IDs, to a new table with identical DDL. Then you drop the old table and rename the new table to the old name. Depending on how much data there is, this could take a noticeable about of time (and temporary disk space).

  2. You could make a little program to issue UPDATE statements to change the IDs. If you let that run slowly, it would "defragment" your IDs over time. Then you could temporarily stop the inserts (just a minute or two), update the last IDs, then restart it. After updating the last IDs you can change the AUTO_INCREMENT value to be the next number and your hole will be gone. This shouldn't cause any real downtime (at least on InnoDB), but it could take quite a while depending on how aggressive your program is.

Of course, both of these ignore referential integrity. I'm assuming that's not a problem (log statements that aren't used as foreign keys, or some such).

MBCook
A: 

Does it really matter if there are gaps?

If you really want to go back and fill them, you can always turn off auto increment, and manually scan for the next available id every time you want to insert a row -- remembering to lock the table to avoid race conditions, of course. But it's a lot of work to do for not much gain.

Do you really need a surrogate key anyway? Depending on the data (you haven't mentioned a schema) you can probably find a natural key.

Duncan