views: 216
answers: 5

Autonumber fields (e.g. "identity" columns in SQL Server) are a common way to provide a unique key for a database table. However, precisely because they are so widely used, at some point in the future we will have to deal with the problem of these columns reaching their maximum value.

Does anyone know of, or have, a recommended strategy to avoid this scenario? I expect that many answers will suggest switching to GUIDs, but given that this would take a large amount of development (especially where many systems are integrated and share the value), is there another way? Are we heading in a direction where newer hardware/operating systems/databases will simply allow larger and larger integer values?

+11  A: 

If you really expect your IDs to run out, use bigint. For most practical purposes, it won't ever run out, and if it does, you should probably use a uniqueidentifier.
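
If you do go that route, here is a minimal sketch of widening an existing int identity column to bigint (the table and column names are hypothetical); the identity property survives the change, but any primary key, index, or foreign key covering the column has to be dropped and recreated first, and the operation rewrites the table:

    ALTER TABLE dbo.Orders DROP CONSTRAINT PK_Orders;
    ALTER TABLE dbo.Orders ALTER COLUMN OrderID bigint NOT NULL;   -- rewrites the table
    ALTER TABLE dbo.Orders ADD CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID);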

If you handle 3 billion transactions per second (one transaction per clock cycle on a 3.0GHz processor), it will still take about a century for bigint to run out (even if you give up a bit for the sign).
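
As a rough sanity check of that arithmetic (purely illustrative numbers):

    -- Max signed bigint divided by 3 billion inserts per second,
    -- converted to years; returns roughly 97.
    SELECT CAST(9223372036854775807 AS float)
           / (3e9 * 60 * 60 * 24 * 365.25) AS years_until_exhaustion;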

That said, once, "640K ought to be enough for anybody." :)

Mehrdad Afshari
+1 for the 640k quote
Robin Day
In fact, it's not really the same issue. 640K was a computing limitation, whereas this is a logical limitation in the real world. Obviously, even at that time there was far more than 640K of data around.
Mehrdad Afshari
Why not Guid? 64 bits might be the next 640K...
Paco
@Paco: As I said in the above comment, it's unlikely since this limitation is not a machine issue, but a logical one. I mentioned uniqueidentifier (GUIDs) in my answer. They have the disadvantage of not being a simple integer. Certainly, if you're using replication or something that inherently...
Mehrdad Afshari
... needs GUIDs, you should use them. Otherwise, if the issue is just overflowing int, I wouldn't go with a uniqueidentifier due to the application complexities it'll probably introduce.
Mehrdad Afshari
+1 for the great calc putting it in perspective
John K
A: 

Is it possible to cycle through (reuse) the numbers that have been deleted from the database? Or are most of the records still alive? Just a thought.

My other idea was Mehrdad's suggestion of switching to bigint.
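
For the first idea, a sketch of locating the lowest reusable gap left by deleted rows (assuming a hypothetical dbo.Orders table, and that nothing outside the database still references the old values):

    -- Smallest OrderID whose successor is missing, plus one = first free value.
    SELECT MIN(o.OrderID) + 1 AS first_free_id
    FROM dbo.Orders AS o
    WHERE NOT EXISTS (SELECT 1 FROM dbo.Orders AS o2
                      WHERE o2.OrderID = o.OrderID + 1);
    -- Inserting that value explicitly requires SET IDENTITY_INSERT dbo.Orders ON.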

phsr
+2  A: 

See these related questions:

The accepted/top voted answers pretty much cover it.

Tomalak
A: 

Identity columns are typically set to start at 1 and increment by +1. Negative values are just as valid as positive, thus doubling the pool of available identifiers.
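
A sketch of how that could be done in SQL Server, assuming a hypothetical dbo.Orders table whose positive int range is exhausted:

    -- Reseed at the minimum int value; the next insert then receives
    -- -2147483648 + increment (i.e. -2147483647 with the default increment of 1).
    DBCC CHECKIDENT ('dbo.Orders', RESEED, -2147483648);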

Jamie Ide
Yes, and if you're using bigint, that would get you 30 million more years :)
Mehrdad Afshari
We have had processes that frequently re-loaded large datasets (sampling data) and generated new identifiers each time. We aren't in any danger of hitting the limit but I can see how it is possible.
Jamie Ide
A: 

If you are receiving such large amounts of data that your IDs could max out, you may also want to support replication, so that you can run multiple synchronized instances of your database.

For such cases, and for cases where you want to avoid "guessable" IDs (web applications etc.), I'd suggest using GUIDs (uniqueidentifier columns) with a default of a new GUID as a replacement for identity columns.

Because Guids are unique, they allow data to be synchronized properly, even if records were added concurrently to the system.
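
A minimal sketch of such a table (table and column names are hypothetical); NEWSEQUENTIALID() is only valid as a column default and produces roughly ordered values, which fragments a clustered index much less than NEWID():

    CREATE TABLE dbo.Orders
    (
        OrderID uniqueidentifier NOT NULL
            CONSTRAINT DF_Orders_OrderID DEFAULT NEWSEQUENTIALID()
            CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED,
        PlacedOn datetime NOT NULL
    );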

Lucero
