views:

256

answers:

5

I'm working on an application where I need to generate unique, non-sequential IDs. One of the constraints I have is that they must consist of 3 digits followed by 2 letters (only about 600k IDs). Given my relatively small pool of IDs I was considering simply generating all possible IDs, shuffling them and putting them into a database. Since, internally, I'll have a simple, sequential, ID to use, it'll be easy to pluck them out one at a time & be sure I don't have any repeats.

This doesn't feel like a very satisfying solution. Does anyone out there have a more interesting method of generating unique IDs from a limited pool than this 'lottery' method?

+1  A: 

Depending on what you define as sequential, you could just pick a certain starting point on the letters, such as 'aa', and just loop through the three digits, so it would be: 001aa 002aa 003aa

Once you get to zz then increment the number part.

James Black
+4  A: 

You could generate a random ID conforming to that standard, do a DB select to see if it exists already, then insert it into a DB to note it has been "used". For the first 25% of the life of that scheme (or about 150k entries), it should be relatively fast to generate new random ID's. After that though, it will take longer and longer, and you might as well pre-fill the table to look for free IDs.

Zak
you could encapsulate this in a stored procedure that returned an unused id. That way you're not having repeatedly hammer the database when testing ID's
ninesided
+4  A: 

Use a finite group. Basically, take a 32 or 64-bit integer, and find a large number that is coprime to the maximum value for your integer; call this number M. Then, for all integers n, n * M will result in a unique number that has lots of digits.

This has the advantage that you don't need to pre-fill the database, or run a separate select query -- you can do this all from within one insert statement, by having your n just be an auto-increment, and have a separate ID column that defaults to the n * M.

Don Werve
actually, if you see two of these ID's side by side (or really any 3 or more at any distance) you will just be able to take the gcd of the ID's you see, and be able to exactly predict the next ID. Unfortunately, this solution would have very little entropy. Also, this doesn't conform to the 3 digit, 2 letter spec the OP used
Zak
A: 

Zak's idea is probably the best one here, but there is one security issue if you go with a prefilled table: predictability. You have to determine the implications if someone can reliably predict what the next ID will be.

MadCoder
+3  A: 
Timm