In PHP, is there a way to generate a unique hash from a string, where the hash is made up of digits only?

example:

return md5(234); // returns 098f6bcd4621d373cade4e832627b4f6

but I need

return numhash(234); // returns 00978902923102372190
(20 digits only)

The problem here is that I want the hash to be short.

Edit: OK, let me explain the backstory. My site has an ID for every registered person, and I also need an ID that the person can use and exchange (hence it can't be too long). So far the IDs have been numbered 00001, 00002, 00003, etc. This (1) makes some people look more important and (2) reveals application info that I don't want to reveal.

To fix points 1 and 2, I need to "hide" the number while keeping it unique.

A: 

First of all, md5 is basically compromised, so you shouldn't be using it for anything but non-critical hashing. PHP5 has the hash() function, see http://www.php.net/manual/en/function.hash.php.

Setting the last parameter to true will give you a string of binary data. Alternatively, you could split the resulting hexadecimal hash into pieces of 2 characters and convert them to integers individually, but I'd expect that to be much slower.
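
A minimal sketch of the binary-output option, assuming SHA-256 as the algorithm (any name returned by hash_algos() would do). The helper name digestToDecimal and the choice to keep only the first 8 bytes are assumptions made for illustration, not part of the original answer:

```php
<?php
// Hypothetical helper: turn part of a digest into a string of decimal digits.
function digestToDecimal(string $input, int $bytes = 8): string
{
    // Passing true as the last argument returns raw binary instead of hex.
    $raw = hash('sha256', $input, true);

    $digits = '';
    // Keep only the first $bytes bytes so the result stays short.
    foreach (str_split(substr($raw, 0, $bytes)) as $char) {
        // Each byte becomes a zero-padded value between 000 and 255.
        $digits .= sprintf('%03d', ord($char));
    }
    return $digits;
}

echo digestToDecimal('234'), "\n"; // 8 bytes give 24 digits
```

Truncating the digest shortens the output but raises the chance of collisions, so the byte count is a trade-off.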

tdammers
Speed is not an issue; the only issue I have is that the numeric hash must be unique and not ridiculously long.
YuriKolovsky
+4  A: 

An MD5 or SHA1 hash in PHP returns a hexadecimal number, so all you need to do is convert bases. PHP has a function that can do this for you:

$bignum = hexdec( md5("test") );

or

$bignum = hexdec( sha1("test") );

PHP Manual for hexdec

Since you want a limited size number, you could then use modular division to put it in a range you want.

$smallnum = $bignum % $upperBound; // $upperBound: put your upper bound here

EDIT

As noted by Artefacto in the comments, this approach produces a number beyond the maximum size of an integer in PHP, and the result after modular division will always be 0. However, taking a substring of the first 15 characters of the hash doesn't have this problem on a 64-bit build, since 15 hexadecimal characters encode only 60 bits. Revised version for calculating the initial large number:

$bignum = hexdec( substr(sha1("test"), 0, 15) );
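
Putting the steps above together, a rough sketch might look like the following. The function name numericHash and the zero-padding step are assumptions added for illustration, and the code assumes a 64-bit PHP build (PHP_INT_SIZE === 8) so that the 60-bit value stays an integer:

```php
<?php
// Hypothetical helper combining substring, base conversion, and modulo.
function numericHash(string $input, int $digits = 5): string
{
    // 15 hex characters = 60 bits, which fits in a signed 64-bit integer.
    $bignum = hexdec(substr(sha1($input), 0, 15));

    // Reduce to the range 0 .. 10^digits - 1.
    $smallnum = $bignum % (10 ** $digits);

    // Left-pad with zeros so every result has the same length.
    return str_pad((string) $smallnum, $digits, '0', STR_PAD_LEFT);
}

echo numericHash('test'), "\n";
```

The result is deterministic but not guaranteed unique: with only 10^5 possible outputs, different inputs will eventually collide.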
derekerdmann
What if I limit the 'test' variable to a limited set of numbers? Would there be a way of reducing the hash size?
YuriKolovsky
@YuriKolovsky - The final hash size would be determined by whatever upper bound you used for the modular division in the second step. For example, if you want your hashes to be at most 5 digits long, then you could use `$smallnum = $bignum % 100000`. This will work regardless of what is put into the initial MD5 or SHA1 hash.
derekerdmann
@derekerdmann this looks like just what I need :D
YuriKolovsky
@YuriKolovsky - Though naturally, using a smaller number will increase the risk of collisions in your hashes. It's up to you to decide how important avoiding collisions is.
derekerdmann
It should be added that md5/sha1 hashes are too long to fit in a PHP integer. You'll already be losing bits when you call hexdec. In fact, I fear that for this reason taking the modulo will cause trouble.
Artefacto
Indeed, this makes the modulo always return 0, as the mantissa doesn't have enough digits to reach even the `10^5` digit. Sorry, -1 here.
Artefacto
@Artefacto - Wow, you're right, I hadn't thought of that. I hadn't had a chance to test it out, but I'm surprised the OP didn't notice it either. After some playing, I've found that if you take a substring of the first 16 characters of the hash string, it seems to fit back inside the integer's size limit. Thanks for catching that.
derekerdmann
With 16 hexadecimal characters, each one encoding 4 bits (2^4=16), this means 16 characters will encode 16*4 = 64 bits. PHP integers may only have 32 bits (e.g. on Windows), so that may also not work. I'll remove the downvote, though.
Artefacto
@Artefacto - I just tested it out with XAMPP on Windows 7, and everything seems to be ok. If there are still problems, then taking a smaller substring wouldn't be hard; I just wanted to preserve as much of the hash as I could.
derekerdmann
@derekerdmann It works better, but not exactly. It doesn't fit an integer, so PHP automatically uses a float instead. A float is implemented as a C `double`. The standard gives no guarantees as to the size of it (except its size >= the size of `float` and <= size of `long double`), but in practice they usually are IEEE 754 double-precision values, which store a 52-bit mantissa (53 bits of precision counting the implicit leading bit). So you're still losing about 11 bits.
Artefacto
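
The precision loss described above can be observed with a quick check. This is only a sketch: the two sample strings are made up for illustration, and the integer result on the second line assumes a 64-bit PHP build:

```php
<?php
$hex16 = 'ffffffffffffffff'; // 64 bits: larger than PHP_INT_MAX
$hex15 = 'fffffffffffffff';  // 60 bits

// hexdec() falls back to float when the value does not fit an integer.
var_dump(is_float(hexdec($hex16)));

// On a 64-bit build, 60 bits still fit in an integer.
var_dump(is_int(hexdec($hex15)));

// Two different 16-character inputs can collapse to the same float,
// because a double carries only 53 bits of precision.
var_dump(hexdec('ffffffffffffffff') === hexdec('fffffffffffffffe'));
```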
@derekerdmann Well, it's apparently more subtle than that because the modulus operator is not defined for double variables in C. I'd have to check the source for PHP for what it does in this case.
Artefacto
@Artefacto - Huh. I had no idea there was that much to it. Probably doesn't matter that much for the OP anymore, but it's still an interesting problem.
derekerdmann
What? Does that mean that the generated number might not be unique? I haven't implemented this yet, and assumed it made a numeric hash. @Artefacto you can also have a go at answering the question :P
YuriKolovsky
@YuriKolovsky Of course it won't necessarily be unique. The only question is how likely a collision will be. That said, if you need uniqueness, I think you should use `mt_rand` and, if there's a collision, repeat.
Artefacto
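
The `mt_rand` suggestion could be sketched roughly as follows. The function name generatePublicId is hypothetical, and the `$taken` array is an assumption standing in for a lookup against wherever the existing IDs are stored (for example, a database column with a unique index):

```php
<?php
// Hypothetical helper: draw random IDs until an unused one is found.
function generatePublicId(array &$taken, int $digits = 5): string
{
    $max = (10 ** $digits) - 1;
    do {
        // Zero-pad so every ID has the same number of digits.
        $candidate = str_pad((string) mt_rand(0, $max), $digits, '0', STR_PAD_LEFT);
    } while (isset($taken[$candidate])); // collision: draw again

    // Record the ID so later calls cannot return it again.
    $taken[$candidate] = true;
    return $candidate;
}

$taken = [];
echo generatePublicId($taken), "\n";
```

This sketch assumes the ID space is much larger than the number of users; if every ID were taken, the loop would never end.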
+1  A: 

You can try crc32(). See the documentation at: http://php.net/manual/en/function.crc32.php

$checksum = crc32("The quick brown fox jumped over the lazy dog.");
printf("%u\n", $checksum); // prints 2191738434 

With that said, crc should only be used to validate the integrity of data.
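
If a fixed-length digit string is wanted, the checksum can be zero-padded. This is a small sketch; the helper name crcDigits is hypothetical, and the width of 10 follows from the largest 32-bit value (4294967295) having 10 digits:

```php
<?php
// Hypothetical helper: format a CRC32 checksum as exactly 10 decimal digits.
function crcDigits(string $input): string
{
    // %u treats the checksum as unsigned; 010 pads with leading zeros.
    return sprintf('%010u', crc32($input));
}

echo crcDigits('234'), "\n";
```

With only 2^32 possible values, collisions become likely well before the ID space fills up, so this suits checksums better than unique identifiers.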

David Titarenco
good to know, thanks
YuriKolovsky