views:

58

answers:

4

Hi,

I want to generate a unique id which will be used in URLs to identify a specific resource. In principle it is the same as pastebin.com etc. does.

The id and resource is not very secret but I want it to be so you just can't decrement a id and then get another users resource. I´m thinking of a CHAR(8) which will look nice in a URL and still be large enough to reduce the chance of guesses. But how do I generate this? For an INT, you can use auto_increment and primary key to insure the uniqueness.

But if I do the following in order

  1. Generate a CHAR(8) in my application
  2. Insure that this ID doesn't exists.
  3. If it not exists, store, else goto 1.

I have to wrap 2. and 3. in a atomic transaction.

But is there a better way? or shouldn't I care about the the check (2.) because a clash doesn't occur regularly. I use MySql and .Net (C#) if that helps. Is it possible to somehow 'encrypt' a auto-incremented int as the text-id and decrypt it again in precisely 8 (or 10) characters.

I have read http://stackoverflow.com/questions/529647/need-a-smaller-alternative-to-guid-for-db-id-but-still-unique-and-random-for-url which was useful, but the use of GUID is not supported in MySql (as far as I know). But a comment on the quality of the LongToShortCode method in the thread would also be appreciated.

Note: the resources can't be changed, only viewed.

Best regards, Lasse

A: 

The easyish way of doing that pseudo-atomically is to

  1. generate the random string
  2. store the string (in effect reserving it)
  3. check if another one exists
  4. if another one exists, remove the one you just made, and return to step 1

Collisions can still happen, but when they do it causes both threads to try again, which in this case isn't a problem.

EDIT: I would suggest taking the first few characters of a cryptographic hash or something for your generation function, but it doesn't really matter.

zebediah49
What do you mean by "storing the string"? if I store it, it does exists and a collision in the storing can appear.
lasseespeholt
+1  A: 

MySql implements the UUID. Which appears to be a GUID with a different name. So that option is still available for you.

If you are still going to use char(8), then you do need to worry about uniqueness of your ID, simply because if you're looking at served URLs, you may not know a violation has occurred until people start reporting problems.

AllenG
The UUID/GUID is too large, and is difficult the strip down (look the other stackoverflow thread). The resource can't be changed, only viewed, so that will not be a problem. In essence, I will provide something similar to e.g. pastebin.com
lasseespeholt
+1  A: 

You could use an int identity and then encrypt/decrypt it before using it, probably not the best idea under heavy load though.

Paul Creasey
Do you have a link to a proper implemented algorithm to convert a integer to a text of length precisely X (8 in my case)?
lasseespeholt
A: 

I think I'll do it like this: A 8 character text id can store a number up to 64^8 = 2^48.

I will then use two columns:

  • ID, INT 2^32 auto-increment
  • Rand, INT 2^16

Then, when I add a row, I will generate a random 2^16 integer and put it in a new row. The text id is then simply generated from from the two numbers combined. And retrieval is easy two - just splitting it up and a simple lookup in the database. Ridiculous simple solution which should eliminate row clashes and be random enough (2^16) to reduce guesses.

Feedback on this approach will be appreciated.

lasseespeholt