I need to generate a unique record ID for a given unique string.

I tried using the UUID format, which seems to work well.

But we feel it is too long.

So we need to cut the UUID string 9f218a38-12cd-5942-b877-80adc0589315 down to something shorter. Removing the '-' characters saves 4 characters. What is the safest part of a UUID to remove? We don't need a universally unique ID; we would like to use a UUID as the source but shorten the string.

We need a unique ID that is specific to one site/database (SQL Server/ADO.NET Data Services).

Any idea or sample in any language is fine.

Thanks in advance

+5  A: 

Why not just convert it to a Base64 string instead? That cuts it down to 22 characters.

http://stackoverflow.com/questions/772802/storing-uuid-as-base64-string
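
As a rough illustration of the Base64 suggestion, here is a minimal Python sketch. The helper names `uuid_to_short` and `short_to_uuid` are made up for this example, and the URL-safe alphabet is an assumption (the standard alphabet gives the same length).

```python
import base64
import uuid


def uuid_to_short(u: uuid.UUID) -> str:
    """Encode the 16 raw bytes of a UUID as 22 URL-safe Base64 characters."""
    # 16 bytes encode to 24 characters ending in "==" padding; drop the padding.
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def short_to_uuid(s: str) -> uuid.UUID:
    """Reverse uuid_to_short by restoring the padding before decoding."""
    return uuid.UUID(bytes=base64.urlsafe_b64decode(s + "=="))


u = uuid.UUID("9f218a38-12cd-5942-b877-80adc0589315")
short = uuid_to_short(u)
restored = short_to_uuid(short)
```

Nothing is lost this way: the short form decodes back to the original UUID, so it is exactly as unique as the full string.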

Spencer Ruport
+3  A: 

A UUID has 128 bits. Have you considered doing a CRC of it? That could get it down to 16 or 32 bits easily, and every bit of the original would contribute to the result. If a CRC isn't good enough, you could always use the first few bytes of a proper hash (SHA-256, for example).
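
A minimal Python sketch of both options, assuming the standard library's `zlib.crc32` and `hashlib.sha256` are acceptable. The function names and the 8-byte prefix length are illustrative choices, not something stated above.

```python
import hashlib
import uuid
import zlib


def crc32_id(u: uuid.UUID) -> int:
    """32-bit CRC of the UUID's 16 raw bytes (an integer in 0..2**32-1)."""
    return zlib.crc32(u.bytes)


def sha256_prefix_id(u: uuid.UUID, nbytes: int = 8) -> str:
    """First nbytes of the SHA-256 digest of the UUID, as a hex string."""
    return hashlib.sha256(u.bytes).digest()[:nbytes].hex()


u = uuid.UUID("9f218a38-12cd-5942-b877-80adc0589315")
crc_value = crc32_id(u)
hash_value = sha256_prefix_id(u)
```

Either result is only as collision-resistant as its length allows. With a 32-bit value, a collision becomes likely once the table holds some tens of thousands of records, so a unique constraint in the database is still needed.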

If you really want to just cut down the UUID, the format of it is described in RFC 4122. You should be able to figure out what parts your implementation doesn't need from that.

Head Geek
A CRC wouldn't be unique at all.
Glenn Maynard
Depends on how much uniqueness he needs. That's why I suggested both CRC and an alternative.
Head Geek
A: 

A UUID provides (almost) 128 bits of uniqueness. You can shorten it to 16 binary bytes, or to 22 base64-encoded characters. I wouldn't recommend removing any part of a UUID; once you do, it no longer makes sense as a UUID. UUIDs were designed so that all 128 bits have meaning. If you want fewer bits than that, you should use some other scheme.

For example, if you could guarantee that only version 4 UUIDs are used, then you could take just the first 32 bits, or just the last 32 bits. You lose uniqueness, but you have pretty random numbers. Just avoid the bits that are fixed (version and variant).
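
A minimal sketch of that idea in Python, assuming the IDs really are version 4 UUIDs. The helper names are hypothetical. In the RFC 4122 layout the version sits in the 7th byte and the variant in the 9th, so the first and last 32 bits contain no fixed bits.

```python
import uuid


def _require_v4(u: uuid.UUID) -> None:
    # Refuse other versions: their leading or trailing bits are not random.
    if u.version != 4:
        raise ValueError("only version 4 (random) UUIDs are supported")


def first_32_bits(u: uuid.UUID) -> int:
    """Top 32 bits of a version 4 UUID (the first 8 hex digits)."""
    _require_v4(u)
    return u.int >> 96


def last_32_bits(u: uuid.UUID) -> int:
    """Bottom 32 bits of a version 4 UUID (the last 8 hex digits)."""
    _require_v4(u)
    return u.int & 0xFFFFFFFF


u = uuid.uuid4()
short_id = format(first_32_bits(u), "08x")
```

Note that the UUID quoted in the question is version 5 (the 13th hex digit is 5), so this sketch would reject it.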

But if you can't guarantee that, you will have real problems. For version 1 UUIDs, the first bits will not be unique for UUIDs generated in the same day, and the last bits will not be unique for UUIDs generated in the same system. Even if you CRC the UUID, it is not guaranteed that you will have 16 or 32 bits of uniqueness.

In this case, just use some other scheme. Generate a 32-bit random number using the system random number generator and use that as your unique ID. Don't rely on UUIDs if you intend to shorten them.
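
A minimal Python sketch of the random-number approach, assuming `secrets.randbits` as the system random source. The `existing` set is a stand-in for whatever uniqueness check the database provides (for example a unique constraint), and `new_random_id` is a made-up name.

```python
import secrets


def new_random_id(existing: set) -> int:
    """Draw 32-bit random IDs until one is not already in use."""
    while True:
        candidate = secrets.randbits(32)  # integer in 0..2**32-1
        if candidate not in existing:
            existing.add(candidate)
            return candidate


used_ids = set()
record_id = new_random_id(used_ids)
```

Unlike a name-based UUID, a random ID cannot be recomputed from the source string, so the mapping from string to ID has to be stored.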

Juliano
A: 

A UUID is 128 bits, or 16 bytes. With no encoding at all, 16 bytes is as small as it gets. UUIDs are commonly written in hexadecimal, which makes them readable strings of 32 characters (36 with the hyphens). With other encodings, you get different results:

  1. base-64 turns 3 8-bit bytes into 4 6-bit characters, so 16 bytes of data becomes 22 characters (not counting the '==' padding)
  2. base-85 turns 4 8-bit bytes into 5 characters of about 6.4 bits each, so 16 bytes of data becomes 20 characters

It all depends on whether you want readable strings and how standard/common an encoding you want to use.
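
The lengths above can be checked with a short Python sketch using the standard `base64` module. `base64.b85encode` is one of several base-85 variants and is chosen here only for illustration.

```python
import base64
import uuid

u = uuid.UUID("9f218a38-12cd-5942-b877-80adc0589315")

# Length in characters of each textual form of the same 16 bytes.
lengths = {
    "canonical": len(str(u)),  # hex digits plus 4 hyphens
    "hex": len(u.hex),  # hex digits only
    "base64": len(base64.b64encode(u.bytes).rstrip(b"=")),  # padding removed
    "base85": len(base64.b85encode(u.bytes)),
}

for name, length in lengths.items():
    print(f"{name}: {length} characters")
```

The base-85 alphabet includes punctuation that usually needs escaping in URLs, which is one reason base-64 is the more common choice despite the two extra characters.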

hughdbrown