I have designed a user interface for a tool where the user enters a "Name" of at most 300 characters. The tool generates a text file ("Name".txt), which is then uploaded to a "server" (mainframe and Unix). I want to shorten the 300-character string to a uniquely identifiable 8-character string (primarily because of issues on the mainframe), something like a tinyurl, using some kind of hashing algorithm. I found a SHA-1 implementation, but the resulting string is 40 characters long. Can someone suggest a VBA implementation of such an algorithm?

The requirement that the resulting string be 8 characters long is strict. My guess is that this should be doable, given that the size of the input string is limited.

+3  A: 

You can just take the first eight characters of the SHA-1 hash.

These hashes (like the original 40 char version) are not guaranteed to be unique, though. If you need uniqueness, you probably need to store each name together with its short version somewhere and issue only short names not used so far. (That's what tinyurl does.)
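
A minimal sketch of both ideas, written in Python rather than VBA purely for illustration (it would need porting). It assumes the 40-character hex form of SHA-1 mentioned in the question; `short_name` and `ShortNameRegistry` are hypothetical names, and the `#1`, `#2`, ... suffix used to re-hash after a collision is just one possible convention, not what tinyurl actually does.

```python
import hashlib


def short_name(name: str, length: int = 8) -> str:
    """Return the first `length` hex characters of the SHA-1 digest of `name`."""
    return hashlib.sha1(name.encode("utf-8")).hexdigest()[:length]


class ShortNameRegistry:
    """Remember which long name owns each short name, so that a collision
    is detected and a different short name is issued instead."""

    def __init__(self, length: int = 8):
        self._length = length
        self._owner = {}  # short name -> long name it was issued for

    def register(self, name: str) -> str:
        attempt = 0
        while True:
            # On a collision, re-hash the name with a counter suffix (assumed convention).
            salted = name if attempt == 0 else f"{name}#{attempt}"
            candidate = short_name(salted, self._length)
            # Claim the candidate if it is free; otherwise see who owns it.
            owner = self._owner.setdefault(candidate, name)
            if owner == name:
                return candidate
            attempt += 1
```

In a real tool the mapping would have to be persisted (a file or a database table) so that every user of the tool sees the same assignments; the in-memory dictionary here only shows the logic.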

Jens
I don't have any idea of the statistical mathematics involved, but simply keeping only the first eight characters will greatly increase the chance of collisions.
MvanGeest
@MvanGeest: Yes, it will, but it might still be good enough. If you take 8 characters of a base64-encoded hash, you have 64^8 = 2.8E14 different combinations. If the hashes are fairly random, the chance of a collision reaches 50% at about 20 million entries; with 2 million entries you are under a 1% chance.
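
These figures can be checked with the usual birthday-problem approximation, p ≈ 1 - exp(-n(n-1)/(2N)) for n entries drawn from N equally likely values. The sketch below is Python for illustration only, and the function names are made up. It also evaluates the smaller space you get from 8 hex characters (16^8), where the 50% point arrives much sooner, at roughly 77 thousand entries.

```python
import math


def collision_probability(entries: int, space: int) -> float:
    """Approximate chance that at least two of `entries` random values collide."""
    return -math.expm1(-entries * (entries - 1) / (2.0 * space))


def entries_for_probability(p: float, space: int) -> float:
    """Approximate number of entries at which the collision chance reaches `p`."""
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


base64_space = 64 ** 8  # 8 base64 characters, 6 bits each
hex_space = 16 ** 8     # 8 hex characters, 4 bits each

print(f"base64: {base64_space:.1e} combinations")
print(f"base64: 50% at about {entries_for_probability(0.5, base64_space):,.0f} entries")
print(f"base64: chance at 2 million entries = {collision_probability(2_000_000, base64_space):.2%}")
print(f"hex:    50% at about {entries_for_probability(0.5, hex_space):,.0f} entries")
```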
Jens
+1  A: 

I think Jens's idea should work just fine. If truncating a SHA-1 hash isn't your thing, you could use CRC-32 (32 bits, i.e. 8 ASCII hex characters from 0 to f); you can try using this example. CRC-32 is less safe as far as collisions are concerned, but it's up to you in the end.
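
To show what the CRC-32 variant looks like, here is a minimal Python sketch (not VBA; a VBA port would need its own CRC-32 routine). It assumes the standard CRC-32 used by zlib and UTF-8 encoding of the name; `crc32_name` is a hypothetical helper name.

```python
import zlib


def crc32_name(name: str) -> str:
    """Return the CRC-32 of `name` as exactly 8 lowercase hex characters."""
    checksum = zlib.crc32(name.encode("utf-8")) & 0xFFFFFFFF  # force unsigned 32-bit
    return format(checksum, "08x")  # zero-padded so the result is always 8 characters


print(crc32_name("123456789"))  # cbf43926, the standard CRC-32 check value
```

Since the result is only 32 bits, it has the same number of possible values as 8 hex characters cut from a SHA-1 hash, so a collision check is still needed if uniqueness is required.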

Yoni H