I need a PHP function that creates an 8-character hash using only [a-z] from any input string. For example, submitting "Stack Overflow" would return something like "gdqreaxc" (8 characters, [a-z] only, no digits allowed).

A: 

How about

substr(preg_replace('/[0-9]/', '', md5($mystring)), 0, 8);

You could add a bit more entropy by mapping the digits to letters:

$myString = str_replace('1', 'g', $myString);
$myString = str_replace('2', 'h', $myString);
$myString = str_replace('3', 'i', $myString);

etc. instead of stripping the digits.

Zak
this is what I did but I need only [a-z]
chubbyk
this is limited to [a-f]
Zak
The issue with this is what you said: only [a-f]. With over 100,000 items to hash, it will probably produce some duplicates.
chubbyk
A: 
function md5toabc($myMD5)
{
    $newString = "";
    // Consume the first 16 hex digits two at a time, producing 8 letters
    for ($i = 0; $i < 16; $i += 2)
    {
        // Add two hex digits (each 0-15) for a sum in the range 0-30.
        // The third argument of substr() is a length, so take one character each.
        $myintval = hexdec(substr($myMD5, $i, 1)) +
                    hexdec(substr($myMD5, $i + 1, 1));
        // Mod by 26 and add 97 to land in the lowercase ASCII range
        $newString .= chr(($myintval % 26) + 97);
    }
    return $newString;
}

Note that this biases the output toward certain characters, but do with it what you will. (As when you roll two dice and 7 is the most common total, sums near the middle of the range are more likely, and the modulo adds further skew.)
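
To make the bias concrete, here is a small sketch that enumerates all 256 pairs of hex digits and counts which letter each pair maps to under the same (a + b) % 26 rule. The function name letter_counts is made up for this illustration and is not part of the answer above.

```php
<?php
// Hypothetical helper: tally the letter produced by (a + b) % 26
// for every possible pair of hex digits a, b in 0..15.
function letter_counts(): array
{
    $counts = array_fill_keys(range('a', 'z'), 0);
    for ($a = 0; $a < 16; $a++) {
        for ($b = 0; $b < 16; $b++) {
            $counts[chr(($a + $b) % 26 + 97)]++;
        }
    }
    return $counts;
}

$counts = letter_counts();
arsort($counts);   // most frequent letters first
print_r($counts);
```

A uniform spread would give each letter about 256 / 26, or roughly 10 pairs. Here 'p' (sum 15) gets 16 pairs, while letters such as 'a' and 'z' get only 6.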

Zak
+2  A: 

Perhaps something like:

$hash = substr(strtolower(preg_replace('/[0-9_\/]+/','',base64_encode(sha1($input)))),0,8);

This produces a SHA1 hash (as a hex string), base-64 encodes it (giving us the full alphabet), strips digits, underscores, and slashes, lowercases it, and truncates it to 8 characters.

For $input = 'yar!':

mwinzewn

For $input = 'yar!!':

yzzhzwjj

So the spread seems pretty good.

Lucas Oman
Seems like there's a small chance that you'll end up with a string shorter than 8 chars, if the base64 representation happens to consist (almost) entirely of digits.
Frank Farmer
That's a valid concern I hadn't considered. Although, on average, 81.25% of a base-64 encoded string would be alpha. In a 56-byte base-64 string from a SHA1 hash, that's an average of 45.5 alpha chars. To get below 8 would probably be several SDs from the mean.
Lucas Oman
You really had me scared and got me curious, so I ran a test with 10,000,000 iterations. I was unable to get any below length of 8. It produced 3 with a length below 40.
Lucas Oman
A: 

You can get a good [a-p]{8} hash (but not a-z) by taking the output of a well-known algorithm and modifying it:

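function mini_hash( $string )
{
  // crc32 yields exactly 8 hex digits
  $h = hash( 'crc32' , $string );
  for($i=0;$i<8;$i++) {
    // Map hex digit 0-15 to a-p (97 is ASCII 'a'); [] replaces the removed {} offset syntax
    $h[$i] = chr(97+hexdec($h[$i]));
  }
  return $h;
}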

interesting set of constraints you posted there

nathan
+2  A: 

This function will generate a hash containing evenly distributed characters [a-z]:

function my_hash($string, $length = 8) {

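    // Convert to a string which may contain only characters [0-9a-p].
    // Caveat: the PHP manual warns that base_convert() may lose precision on
    // large numbers, and a 128-bit MD5 value is well beyond float precision.
    $hash = base_convert(md5($string), 16, 26);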

    // Get part of the string
    $hash = substr($hash, -$length);

    // In rare cases it will be too short, add zeroes
    $hash = str_pad($hash, $length, '0', STR_PAD_LEFT);

    // Convert character set from [0-9a-p] to [a-z]
    $hash = strtr($hash, '0123456789', 'qrstuvwxyz');

    return $hash;
}
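
The PHP manual notes that base_convert() may lose precision on large numbers. A hedged variant that sticks to integer arithmetic could look like the sketch below. The name az_hash and the choice of 15 hex digits are assumptions for illustration, and it presumes a 64-bit build of PHP 7 or later (for intdiv and scalar type hints).

```php
<?php
// Hypothetical alternative, not the answer's own method.
function az_hash(string $input, int $length = 8): string
{
    // 15 hex digits = 60 bits, which fits in a 64-bit PHP integer,
    // so hexdec() returns an exact int here (assumes 64-bit PHP).
    $n = hexdec(substr(md5($input), 0, 15));

    $out = '';
    for ($i = 0; $i < $length; $i++) {
        // Peel off one base-26 digit and map 0-25 to a-z
        $out .= chr(97 + $n % 26);
        $n = intdiv($n, 26);
    }
    return $out;
}

echo az_hash('Stack Overflow'), "\n";
```

Since 26^8 is far smaller than 2^60, the leftover modulo bias is tiny for the default length. Lengths above about 12 would exhaust the 60 bits and are outside what this sketch is meant for.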

By the way, if this matters to you: for 100,000 different strings you'll have a ~2% chance of a hash collision (for an 8-character hash), and for a million strings the chance rises to ~90%, if my math is correct.
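
Those figures can be checked with the usual birthday-problem approximation, p ≈ 1 - exp(-n(n-1) / (2N)), where N = 26^8 is the number of possible 8-letter hashes. The sketch below assumes an ideal, uniformly distributed hash; collision_probability is a name invented for this example.

```php
<?php
// Approximate chance of at least one collision among $n items
// drawn uniformly from $space possible hash values.
function collision_probability(int $n, float $space): float
{
    return 1.0 - exp(-$n * ($n - 1) / (2.0 * $space));
}

$space = 26 ** 8; // 208,827,064,576 possible 8-letter [a-z] strings

printf("100,000 strings:   %.1f%%\n", 100 * collision_probability(100000, $space));
printf("1,000,000 strings: %.1f%%\n", 100 * collision_probability(1000000, $space));
```

Under that assumption the results land close to the ~2% and ~90% estimates quoted above. A hash that is less than uniform would collide more often.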

Alexander Konstantinov