tags:

views:

334

answers:

2

I know that this might be a long stretch, but could someone possibly tell me where my current implementation of the MD5 algorithm in PHP is going wrong? I just can't seem to work out what's wrong with it.

It returns a 32 character hex string (25% of the time it produces a string less than 32 characters though), but it isn't producing the same 32 chars that the built-in MD5 function does.

Thank you very much.


<?php

function MD($string){
$a = "67452301";
$b = "EFCDAB89";
$c = "98BADCFE";
$d = "10325476";

$words = init($string);

for($i = 0; $i <= count($words)/16-1; $i++){
    $A = $a;
    $B = $b;
    $C = $c;
    $D = $d;

    /* ROUND 1 */
    FF ($A, $B, $C, $D, $words[0 + ($i * 16)], 7, "d76aa478"); 
    FF ($D, $A, $B, $C, $words[1 + ($i * 16)], 12, "e8c7b756"); 
    FF ($C, $D, $A, $B, $words[2 + ($i * 16)], 17, "242070db"); 
    FF ($B, $C, $D, $A, $words[3 + ($i * 16)], 22, "c1bdceee"); 
    FF ($A, $B, $C, $D, $words[4 + ($i * 16)], 7, "f57c0faf"); 
    FF ($D, $A, $B, $C, $words[5 + ($i * 16)], 12, "4787c62a"); 
    FF ($C, $D, $A, $B, $words[6 + ($i * 16)], 17, "a8304613"); 
    FF ($B, $C, $D, $A, $words[7 + ($i * 16)], 22, "fd469501"); 
    FF ($A, $B, $C, $D, $words[8 + ($i * 16)], 7, "698098d8"); 
    FF ($D, $A, $B, $C, $words[9 + ($i * 16)], 12, "8b44f7af"); 
    FF ($C, $D, $A, $B, $words[10 + ($i * 16)], 17, "ffff5bb1"); 
    FF ($B, $C, $D, $A, $words[11 + ($i * 16)], 22, "895cd7be"); 
    FF ($A, $B, $C, $D, $words[12 + ($i * 16)], 7, "6b901122"); 
    FF ($D, $A, $B, $C, $words[13 + ($i * 16)], 12, "fd987193"); 
    FF ($C, $D, $A, $B, $words[14 + ($i * 16)], 17, "a679438e"); 
    FF ($B, $C, $D, $A, $words[15 + ($i * 16)], 22, "49b40821"); 

    /* ROUND 2 */
    GG ($A, $B, $C, $D, $words[1 + ($i * 16)], 5, "f61e2562"); 
    GG ($D, $A, $B, $C, $words[6 + ($i * 16)], 9, "c040b340"); 
    GG ($C, $D, $A, $B, $words[11 + ($i * 16)], 14, "265e5a51"); 
    GG ($B, $C, $D, $A, $words[0 + ($i * 16)], 20, "e9b6c7aa"); 
    GG ($A, $B, $C, $D, $words[5 + ($i * 16)], 5, "d62f105d"); 
    GG ($D, $A, $B, $C, $words[10 + ($i * 16)], 9, "02441453"); 
    GG ($C, $D, $A, $B, $words[15 + ($i * 16)], 14, "d8a1e681"); 
    GG ($B, $C, $D, $A, $words[4 + ($i * 16)], 20, "e7d3fbc8"); 
    GG ($A, $B, $C, $D, $words[9 + ($i * 16)], 5, "21e1cde6"); 
    GG ($D, $A, $B, $C, $words[14 + ($i * 16)], 9, "c33707d6"); 
    GG ($C, $D, $A, $B, $words[3 + ($i * 16)], 14, "f4d50d87"); 
    GG ($B, $C, $D, $A, $words[8 + ($i * 16)], 20, "455a14ed"); 
    GG ($A, $B, $C, $D, $words[13 + ($i * 16)], 5, "a9e3e905"); 
    GG ($D, $A, $B, $C, $words[2 + ($i * 16)], 9, "fcefa3f8"); 
    GG ($C, $D, $A, $B, $words[7 + ($i * 16)], 14, "676f02d9"); 
    GG ($B, $C, $D, $A, $words[12 + ($i * 16)], 20, "8d2a4c8a"); 

    /* ROUND 3 */
    HH ($A, $B, $C, $D, $words[5 + ($i * 16)], 4, "fffa3942"); 
    HH ($D, $A, $B, $C, $words[8 + ($i * 16)], 11, "8771f681"); 
    HH ($C, $D, $A, $B, $words[11 + ($i * 16)], 16, "6d9d6122"); 
    HH ($B, $C, $D, $A, $words[14 + ($i * 16)], 23, "fde5380c"); 
    HH ($A, $B, $C, $D, $words[1 + ($i * 16)], 4, "a4beea44"); 
    HH ($D, $A, $B, $C, $words[4 + ($i * 16)], 11, "4bdecfa9"); 
    HH ($C, $D, $A, $B, $words[7 + ($i * 16)], 16, "f6bb4b60"); 
    HH ($B, $C, $D, $A, $words[10 + ($i * 16)], 23, "bebfbc70"); 
    HH ($A, $B, $C, $D, $words[13 + ($i * 16)], 4, "289b7ec6"); 
    HH ($D, $A, $B, $C, $words[0 + ($i * 16)], 11, "eaa127fa"); 
    HH ($C, $D, $A, $B, $words[3 + ($i * 16)], 16, "d4ef3085"); 
    HH ($B, $C, $D, $A, $words[6 + ($i * 16)], 23, "04881d05"); 
    HH ($A, $B, $C, $D, $words[9 + ($i * 16)], 4, "d9d4d039"); 
    HH ($D, $A, $B, $C, $words[12 + ($i * 16)], 11, "e6db99e5"); 
    HH ($C, $D, $A, $B, $words[15 + ($i * 16)], 16, "1fa27cf8"); 
    HH ($B, $C, $D, $A, $words[2 + ($i * 16)], 23, "c4ac5665"); 

    /* ROUND 4 */
    II ($A, $B, $C, $D, $words[0 + ($i * 16)], 6, "f4292244"); 
    II ($D, $A, $B, $C, $words[7 + ($i * 16)], 10, "432aff97"); 
    II ($C, $D, $A, $B, $words[14 + ($i * 16)], 15, "ab9423a7"); 
    II ($B, $C, $D, $A, $words[5 + ($i * 16)], 21, "fc93a039"); 
    II ($A, $B, $C, $D, $words[12 + ($i * 16)], 6, "655b59c3"); 
    II ($D, $A, $B, $C, $words[3 + ($i * 16)], 10, "8f0ccc92"); 
    II ($C, $D, $A, $B, $words[10 + ($i * 16)], 15, "ffeff47d"); 
    II ($B, $C, $D, $A, $words[1 + ($i * 16)], 21, "85845dd1"); 
    II ($A, $B, $C, $D, $words[8 + ($i * 16)], 6, "6fa87e4f"); 
    II ($D, $A, $B, $C, $words[15 + ($i * 16)], 10, "fe2ce6e0"); 
    II ($C, $D, $A, $B, $words[6 + ($i * 16)], 15, "a3014314"); 
    II ($B, $C, $D, $A, $words[13 + ($i * 16)], 21, "4e0811a1"); 
    II ($A, $B, $C, $D, $words[4 + ($i * 16)], 6, "f7537e82"); 
    II ($D, $A, $B, $C, $words[11 + ($i * 16)], 10, "bd3af235"); 
    II ($C, $D, $A, $B, $words[2 + ($i * 16)], 15, "2ad7d2bb"); 
    II ($B, $C, $D, $A, $words[9 + ($i * 16)], 21, "eb86d391"); 

    addVars($a, $b, $c, $d, $A, $B, $C, $D);    
}
    $MD5 = $a.$b.$c.$d;
    return $MD5;
}

/* General functions */

function hexbin($str){
    $hexbinmap = array("0" => "0000"
                     , "1" => "0001"
                     , "2" => "0010"
                     , "3" => "0011"
                     , "4" => "0100"
                     , "5" => "0101"
                     , "6" => "0110"
                     , "7" => "0111"
                     , "8" => "1000"
                     , "9" => "1001"
                     , "A" => "1010"
                     , "a" => "1010"
                     , "B" => "1011"
                     , "b" => "1011"
                     , "C" => "1100"
                     , "c" => "1100"
                     , "D" => "1101"
                     , "d" => "1101"
                     , "E" => "1110"
                     , "e" => "1110"
                     , "F" => "1111"
                     , "f" => "1111");                    
    $bin = "";
    for ($i = 0; $i < strlen($str); $i++)
    {
        $bin .= $hexbinmap[$str[$i]];
    }
    $bin = ltrim($bin, '0'); 
    // echo "Original: ".$str."  New: ".$bin."<br />";
    return $bin;
}

function strhex($str){
    $hex = "";
    for ($i = 0; $i < strlen($str); $i++)
    {
        $hex = $hex.dechex(ord($str[$i]));
    }
    return $hex;
}


/* MD5-specific functions */

function init($string){
    $len = strlen($string);
    $hex = strhex($string); // convert ascii string to hex
    $bin = hexbin($hex); // convert hex string to bin
    $padded = pad($bin);
    $padded = pad($padded, 1, $len);
    $block = str_split($padded, 32);
    return $block;
}

function pad($bin, $type=0, $len = 0){
    if($type == 0){
        $bin = $bin."1";
        $buff = strlen($bin) % 512;
        if($buff != 448){
            while(strlen($bin) % 512 != 448){
                $bin = $bin."0";
            }
        }
    }
    // append length (b) of string to latter 64 bits
    elseif($type == 1){
        $bLen = decbin($len);
        if(strlen($bLen) > 64){
            $words = truncate64($bLen);
            $bin .= $words[1].$words[0];
        }
        else{
            while(strlen($bLen) < 64){
                $bLen .= "0";
            }
            $words = str_split ($bLen, 32);
            $bin .= $words[1].$words[0];
        }

    }
    return $bin;
}

function truncate64($string){
    $trunc = substr($string, strlen($string) - 64, 64);
    $trunc = str_split ($trunc, 32);
    return $trunc;
}


/* MD5 base functions */

function F($X, $Y, $Z){
    $X = hexbin($X);
    $Y = hexbin($Y);
    $Z = hexbin($Z);
    $calc = ($X & $Y) | ((~ $X) & $Z); // X AND Y OR NOT X AND Z
    $calc = bindec($calc);
    return  $calc; 
}

function G($X, $Y, $Z){
    $X = hexbin($X);
    $Y = hexbin($Y);
    $Z = hexbin($Z);
    $calc = ($X & $Z) | ($Y & (~ $Z)) ; // X AND Z OR Y AND NOT Z
    $calc = bindec($calc);
    return  $calc; 
}

function H($X, $Y, $Z){
    $X = hexbin($X);
    $Y = hexbin($Y);
    $Z = hexbin($Z);
    $calc = $X ^ $Y ^ $Z; // X XOR Y XOR Z
    $calc = bindec($calc);
    return  $calc; 
}

function I($X, $Y, $Z){
    $X = hexbin($X);
    $Y = hexbin($Y);
    $Z = hexbin($Z);
    $calc = $Y ^ ($X | (~ $Z)) ; // Y XOR (X OR NOT Z)
    $calc = bindec($calc);
    return  $calc; 
}

/* MD5 round functions */

/*
$A - hex, $B - hex, $C - hex, $D - hex (F - dec)
$M - binary
$s - decimal
$t - hex
*/
function FF(&$A, $B, $C, $D, $M, $s, $t){
    $A = hexdec($A);
    $t = hexdec($t);
    $M = bindec($M);
    $A = hexdec($B) + (($A + F($B, $C, $D) + $M + $t)); //decimal
    $A = rotate($A, $s);
}

function GG(&$A, $B, $C, $D, $M, $s, $t){
    $A = hexdec($A);
    $t = hexdec($t);
    $M = bindec($M);
    $A = hexdec($B) + (($A + G($B, $C, $D) + $M + $t)); //decimal
    $A = rotate($A, $s);
}

function HH(&$A, $B, $C, $D, $M, $s, $t){
    $A = hexdec($A);
    $t = hexdec($t);
    $M = bindec($M);
    $A = hexdec($B) + (($A + H($B, $C, $D) + $M + $t)); //decimal
    $A = rotate($A, $s);
}

function II(&$A, $B, $C, $D, $M, $s, $t){
    $A = hexdec($A);
    $t = hexdec($t);
    $M = bindec($M);
    $A = hexdec($B) + (($A + I($B, $C, $D) + $M + $t)); //decimal
    $A = rotate($A, $s);
}

// shift
function rotate($decimal, $bits) { //returns hex
  $binary = decbin($decimal);
  $shifted = substr($binary, $bits).substr($binary, 0, $bits);
  $hexshift = base_convert($shifted, 2, 16);
  return $hexshift;
}

function addVars(&$a, &$b, &$c, &$d, $A, $B, $C, $D){
    $A = hexdec($A);
    $B = hexdec($B);
    $C = hexdec($C);
    $D = hexdec($D);
    $aa = hexdec($a);
    $bb = hexdec($b);
    $cc = hexdec($c);
    $dd = hexdec($d);

    $aa = $aa + $A;
    $bb = $bb + $A;
    $cc = $cc + $A;
    $dd = $dd + $A;

    $a = dechex($aa);
    $b = dechex($bb);
    $c = dechex($cc);
    $d = dechex($dd);
}

?>
+5  A: 

Good on you for trying! I've had a similar experience, one time long ago I implemented an MD5 algorithm in Tcl. The best way I found to debug it was to trace through it line by line, knowing what operation should be performed, and making sure by hand calculation whether the correct operation was indeed performed.

There's no easy answer to this, and it's impossible to tell what might be wrong from the code you posted without detailed analysis.

(I'm assuming that you already know about the standard md5() function and you're doing this for learning purposes.)

Greg Hewgill
Thanks for the suggestion - it suprising how much you overlook the "low-tech" ways of doing this type of thing when they're probably the best methods.
blake
And yes, I know about the md5 function ;)
blake
+14  A: 

For some reason this question didn't leave me alone, so I went through your code and fixed the bugs until it worked:

Before you go through this, I have two pieces of advice:

  1. Don't convert back and forth between int values and hex/bin representations; convert to int values before any processing is done; makes the code much more readable.

  2. Use call_user_func() and implement the GG -> G, II -> I functions only once.

Also, there is one subtle bug remaining; an input string made up exclusively of zero characters will not be encoded correctly; I'll leave fixing that as an exercise for the reader :-).

  • In init():

What is appended is the length of the unpadded message in number of bits, not number of characters:

- $len = strlen($string)
+ $len = strlen($string) * 8;
  $hex = strhex($string); // convert ascii string to hex

also, you have to pad what you get from hexbin otherwise the subsequent call to str_split will get the alignment wrong:

- $bin = hexbin($hex);
+ $bin = leftpad(hexbin($hex), $len); // convert hex string to bin
  $block = str_split($padded, 32);

also, the byte order is little endian:

+ foreach ($block as &$b) {
+     $b = implode('', array_reverse(str_split($b, 8)));
+ }
  • In strhex():

There are a lot of padding errors like this; dechex(ord("\1")) is '1' not '01':

- $hex = $hex.dechex(ord($str[$i]));
+ $hex = $hex.leftpad(dechex(ord($str[$i])), 2);
  • In pad():

The byte-order is litte endian (you're splitting into 32-bit words) and truncate64() is completely off the map :-):

-   $bLen = decbin($len);
-  if(strlen($bLen) > 64){
-    $words = truncate64($bLen);
-    $bin .= $words[1].$words[0];
-  }
-  else{
-      while(strlen($bLen) < 64){
-        $bLen .= "0";
-      }
-    $words = str_split ($bLen, 32);
-    $bin .= $words[1].$words[0];
-  }
+ $bLen = leftpad(decbin($len), 64);
+ $bin .= implode('', array_reverse(str_split($bLen, 8)));
  • In F(), G(), H(), I():

Bitwise operators can't be used on binary strings:

- $X = hexbin($X);
- $Y = hexbin($Y);
- $Z = hexbin($Z);
- $calc = ...
- $calc = bindec($calc);
+ $X = hexdec($X);
+ $Y = hexdec($Y);
+ $Z = hexdec($Z);
+ $calc = ...
  • In FF(), GG(), HH(), II():

You are adding $B before rotation, it must be added afterwards; also, since you are going back and forth between string and int representations and PHP_INT_SIZE may be greater than 4 (e.g. on 64-bit platforms), you have to make sure you only use the lower 32-bits:

- $A = hexdec($B) + (($A + H($B, $C, $D) + $M + $t)); //decimal
+ $A = ($A + H($B, $C, $D) + $M + $t) & 0xffffffff; //decimal
+ $A = rotate($A, $s);
+ $A = dechex((hexdec($B) + hexdec($A)) & 0xffffffff);
  • In addVars():

$A is repeated for each addition, probably a copy-paste artifact :-):

- $aa = $aa + $A;
- $bb = $bb + $A;
- $cc = $cc + $A;
- $dd = $dd + $A;
+ $aa = ($aa + $A) & 0xffffffff;
+ $bb = ($bb + $B) & 0xffffffff;
+ $cc = ($cc + $C) & 0xffffffff;
+ $dd = ($dd + $D) & 0xffffffff;
  • In rotate():

There is a padding bug (again) in your rotate function. Just throw it away and replace it with:

+ function rotate ($decimal, $bits) { //returns hex
+     return dechex((($decimal << $bits) |  ($decimal >> (32 - $bits))) & 0xffffffff);
+ }
  • In MD():

Last but not least, you have to convert to little endian again:

- $MD5 = $a.$b.$c.$d;
+ $MD5 = '';
+ foreach (array($a, $b, $c, $d) as $x) {
+     $MD5 .= implode('', array_reverse(str_split(leftpad($x, 8), 2)));
+ }

The missing leftpad() function:

+ function leftpad($needs_padding, $alignment)
+ {
+   if (strlen($needs_padding) % $alignment) {
+       $pad_amount    = $alignment - strlen($needs_padding) % $alignment;
+       $left_pad      = implode('', array_fill(0, $pad_amount, '0'));
+       $needs_padding = $left_pad . $needs_padding;
+   }
+   return $needs_padding;
+ }

Full edited source:

<?php

function MD($string){
$a = "67452301";
$b = "EFCDAB89";
$c = "98BADCFE";
$d = "10325476";

$words = init($string);

for($i = 0; $i <= count($words)/16-1; $i++){
        $A = $a;
        $B = $b;
        $C = $c;
        $D = $d;


        /* ROUND 1 */
        FF ($A, $B, $C, $D, $words[0 + ($i * 16)], 7, "d76aa478"); 
        FF ($D, $A, $B, $C, $words[1 + ($i * 16)], 12, "e8c7b756"); 
        FF ($C, $D, $A, $B, $words[2 + ($i * 16)], 17, "242070db"); 
        FF ($B, $C, $D, $A, $words[3 + ($i * 16)], 22, "c1bdceee"); 
        FF ($A, $B, $C, $D, $words[4 + ($i * 16)], 7, "f57c0faf"); 
        FF ($D, $A, $B, $C, $words[5 + ($i * 16)], 12, "4787c62a"); 
        FF ($C, $D, $A, $B, $words[6 + ($i * 16)], 17, "a8304613"); 
        FF ($B, $C, $D, $A, $words[7 + ($i * 16)], 22, "fd469501"); 
        FF ($A, $B, $C, $D, $words[8 + ($i * 16)], 7, "698098d8"); 
        FF ($D, $A, $B, $C, $words[9 + ($i * 16)], 12, "8b44f7af"); 
        FF ($C, $D, $A, $B, $words[10 + ($i * 16)], 17, "ffff5bb1"); 
        FF ($B, $C, $D, $A, $words[11 + ($i * 16)], 22, "895cd7be"); 
        FF ($A, $B, $C, $D, $words[12 + ($i * 16)], 7, "6b901122"); 
        FF ($D, $A, $B, $C, $words[13 + ($i * 16)], 12, "fd987193"); 
        FF ($C, $D, $A, $B, $words[14 + ($i * 16)], 17, "a679438e"); 
        FF ($B, $C, $D, $A, $words[15 + ($i * 16)], 22, "49b40821"); 

        /* ROUND 2 */
        GG ($A, $B, $C, $D, $words[1 + ($i * 16)], 5, "f61e2562"); 
        GG ($D, $A, $B, $C, $words[6 + ($i * 16)], 9, "c040b340"); 
        GG ($C, $D, $A, $B, $words[11 + ($i * 16)], 14, "265e5a51"); 
        GG ($B, $C, $D, $A, $words[0 + ($i * 16)], 20, "e9b6c7aa"); 
        GG ($A, $B, $C, $D, $words[5 + ($i * 16)], 5, "d62f105d"); 
        GG ($D, $A, $B, $C, $words[10 + ($i * 16)], 9, "02441453"); 
        GG ($C, $D, $A, $B, $words[15 + ($i * 16)], 14, "d8a1e681"); 
        GG ($B, $C, $D, $A, $words[4 + ($i * 16)], 20, "e7d3fbc8"); 
        GG ($A, $B, $C, $D, $words[9 + ($i * 16)], 5, "21e1cde6"); 
        GG ($D, $A, $B, $C, $words[14 + ($i * 16)], 9, "c33707d6"); 
        GG ($C, $D, $A, $B, $words[3 + ($i * 16)], 14, "f4d50d87"); 
        GG ($B, $C, $D, $A, $words[8 + ($i * 16)], 20, "455a14ed"); 
        GG ($A, $B, $C, $D, $words[13 + ($i * 16)], 5, "a9e3e905"); 
        GG ($D, $A, $B, $C, $words[2 + ($i * 16)], 9, "fcefa3f8"); 
        GG ($C, $D, $A, $B, $words[7 + ($i * 16)], 14, "676f02d9"); 
        GG ($B, $C, $D, $A, $words[12 + ($i * 16)], 20, "8d2a4c8a"); 

        /* ROUND 3 */
        HH ($A, $B, $C, $D, $words[5 + ($i * 16)], 4, "fffa3942"); 
        HH ($D, $A, $B, $C, $words[8 + ($i * 16)], 11, "8771f681"); 
        HH ($C, $D, $A, $B, $words[11 + ($i * 16)], 16, "6d9d6122"); 
        HH ($B, $C, $D, $A, $words[14 + ($i * 16)], 23, "fde5380c"); 
        HH ($A, $B, $C, $D, $words[1 + ($i * 16)], 4, "a4beea44"); 
        HH ($D, $A, $B, $C, $words[4 + ($i * 16)], 11, "4bdecfa9"); 
        HH ($C, $D, $A, $B, $words[7 + ($i * 16)], 16, "f6bb4b60"); 
        HH ($B, $C, $D, $A, $words[10 + ($i * 16)], 23, "bebfbc70"); 
        HH ($A, $B, $C, $D, $words[13 + ($i * 16)], 4, "289b7ec6"); 
        HH ($D, $A, $B, $C, $words[0 + ($i * 16)], 11, "eaa127fa"); 
        HH ($C, $D, $A, $B, $words[3 + ($i * 16)], 16, "d4ef3085"); 
        HH ($B, $C, $D, $A, $words[6 + ($i * 16)], 23, "04881d05"); 
        HH ($A, $B, $C, $D, $words[9 + ($i * 16)], 4, "d9d4d039"); 
        HH ($D, $A, $B, $C, $words[12 + ($i * 16)], 11, "e6db99e5"); 
        HH ($C, $D, $A, $B, $words[15 + ($i * 16)], 16, "1fa27cf8"); 
        HH ($B, $C, $D, $A, $words[2 + ($i * 16)], 23, "c4ac5665"); 

        /* ROUND 4 */
        II ($A, $B, $C, $D, $words[0 + ($i * 16)], 6, "f4292244"); 
        II ($D, $A, $B, $C, $words[7 + ($i * 16)], 10, "432aff97"); 
        II ($C, $D, $A, $B, $words[14 + ($i * 16)], 15, "ab9423a7"); 
        II ($B, $C, $D, $A, $words[5 + ($i * 16)], 21, "fc93a039"); 
        II ($A, $B, $C, $D, $words[12 + ($i * 16)], 6, "655b59c3"); 
        II ($D, $A, $B, $C, $words[3 + ($i * 16)], 10, "8f0ccc92"); 
        II ($C, $D, $A, $B, $words[10 + ($i * 16)], 15, "ffeff47d"); 
        II ($B, $C, $D, $A, $words[1 + ($i * 16)], 21, "85845dd1"); 
        II ($A, $B, $C, $D, $words[8 + ($i * 16)], 6, "6fa87e4f"); 
        II ($D, $A, $B, $C, $words[15 + ($i * 16)], 10, "fe2ce6e0"); 
        II ($C, $D, $A, $B, $words[6 + ($i * 16)], 15, "a3014314"); 
        II ($B, $C, $D, $A, $words[13 + ($i * 16)], 21, "4e0811a1"); 
        II ($A, $B, $C, $D, $words[4 + ($i * 16)], 6, "f7537e82"); 
        II ($D, $A, $B, $C, $words[11 + ($i * 16)], 10, "bd3af235"); 
        II ($C, $D, $A, $B, $words[2 + ($i * 16)], 15, "2ad7d2bb"); 
        II ($B, $C, $D, $A, $words[9 + ($i * 16)], 21, "eb86d391"); 

        addVars($a, $b, $c, $d, $A, $B, $C, $D);        
}
  $MD5 = '';
  foreach (array($a, $b, $c, $d) as $x) {
      $MD5 .= implode('', array_reverse(str_split(leftpad($x, 8), 2)));
  }

        return $MD5;
}

/* General functions */

function hexbin($str){
        $hexbinmap = array("0" => "0000"
                                                , "1" => "0001"
                                                , "2" => "0010"
                                                , "3" => "0011"
                                                , "4" => "0100"
                                                , "5" => "0101"
                                                , "6" => "0110"
                                                , "7" => "0111"
                                                , "8" => "1000"
                                                , "9" => "1001"
                                                , "A" => "1010"
                                                , "a" => "1010"
                                                , "B" => "1011"
                                                , "b" => "1011"
                                                , "C" => "1100"
                                                , "c" => "1100"
                                                , "D" => "1101"
                                                , "d" => "1101"
                                                , "E" => "1110"
                                                , "e" => "1110"
                                                , "F" => "1111"
                                                , "f" => "1111");

        $bin = "";
    for ($i = 0; $i < strlen($str); $i++)
    {
        $bin .= $hexbinmap[$str[$i]];
    }
    $bin = ltrim($bin, '0'); 
        // echo "Original: ".$str."  New: ".$bin."<br />";
    return $bin;
}

function strhex($str){
    $hex = "";
    for ($i = 0; $i < strlen($str); $i++)
    {
        $hex = $hex.leftpad(dechex(ord($str[$i])), 2);
    }
    return $hex;
}


/* MD5-specific functions */

function init($string){
        $len = strlen($string) * 8;
        $hex = strhex($string); // convert ascii string to hex
        $bin = leftpad(hexbin($hex), $len); // convert hex string to bin
        $padded = pad($bin);
        $padded = pad($padded, 1, $len);
        $block = str_split($padded, 32);

        foreach ($block as &$b) {
            $b = implode('', array_reverse(str_split($b, 8)));
        }

        return $block;
}

function pad($bin, $type=0, $len = 0){
        if($type == 0){
        $bin = $bin."1";
        $buff = strlen($bin) % 512;
        if($buff != 448){
                while(strlen($bin) % 512 != 448){

                        $bin = $bin."0";
                }
        }
        }
        // append length (b) of string to latter 64 bits
        elseif($type == 1){
            $bLen = leftpad(decbin($len), 64);
            $bin .= implode('', array_reverse(str_split($bLen, 8)));
        }
        return $bin;
}

/* MD5 base functions */

function F($X, $Y, $Z){
        $X = hexdec($X);
        $Y = hexdec($Y);
        $Z = hexdec($Z);
        $calc = (($X & $Y) | ((~ $X) & $Z)); // X AND Y OR NOT X AND Z
        return  $calc; 
}

function G($X, $Y, $Z){
        $X = hexdec($X);
        $Y = hexdec($Y);
        $Z = hexdec($Z);
        $calc = (($X & $Z) | ($Y & (~ $Z))); // X AND Z OR Y AND NOT Z
        return  $calc; 
}

function H($X, $Y, $Z){
        $X = hexdec($X);
        $Y = hexdec($Y);
        $Z = hexdec($Z);
        $calc = ($X ^ $Y ^ $Z); // X XOR Y XOR Z
        return  $calc; 
}

function I($X, $Y, $Z){
        $X = hexdec($X);
        $Y = hexdec($Y);
        $Z = hexdec($Z);
        $calc = ($Y ^ ($X | (~ $Z))) ; // Y XOR (X OR NOT Z)
        return  $calc; 
}

/* MD5 round functions */

/*
$A - hex, $B - hex, $C - hex, $D - hex (F - dec)
$M - binary
$s - decimal
$t - hex
*/
function FF(&$A, $B, $C, $D, $M, $s, $t){
        $A = hexdec($A);
        $t = hexdec($t);
        $M = bindec($M);
        $A = ($A + F($B, $C, $D) + $M + $t) & 0xffffffff; //decimal
        $A = rotate($A, $s);
        $A = dechex((hexdec($B) + hexdec($A)) & 0xffffffff);
}

function GG(&$A, $B, $C, $D, $M, $s, $t){
        $A = hexdec($A);
        $t = hexdec($t);
        $M = bindec($M);
        $A = ($A + G($B, $C, $D) + $M + $t) & 0xffffffff; //decimal
        $A = rotate($A, $s);
        $A = dechex((hexdec($B) + hexdec($A)) & 0xffffffff);
}

function HH(&$A, $B, $C, $D, $M, $s, $t){
        $A = hexdec($A);
        $t = hexdec($t);
        $M = bindec($M);
        $A = ($A + H($B, $C, $D) + $M + $t) & 0xffffffff; //decimal
        $A = rotate($A, $s);
        $A = dechex((hexdec($B) + hexdec($A)) & 0xffffffff);
}

function II(&$A, $B, $C, $D, $M, $s, $t){
        $A = hexdec($A);
        $t = hexdec($t);
        $M = bindec($M);
        $A = ($A + I($B, $C, $D) + $M + $t) & 0xffffffff; //decimal
        $A = rotate($A, $s);
        $A = dechex((hexdec($B) + hexdec($A)) & 0xffffffff);
}

// shift
function rotate ($decimal, $bits) { //returns hex
    return dechex((($decimal << $bits) |  ($decimal >> (32 - $bits))) & 0xffffffff);
}

function addVars(&$a, &$b, &$c, &$d, $A, $B, $C, $D){
        $A = hexdec($A);
        $B = hexdec($B);
        $C = hexdec($C);
        $D = hexdec($D);
        $aa = hexdec($a);
        $bb = hexdec($b);
        $cc = hexdec($c);
        $dd = hexdec($d);

        $aa = ($aa + $A) & 0xffffffff;
        $bb = ($bb + $B) & 0xffffffff;
        $cc = ($cc + $C) & 0xffffffff;
        $dd = ($dd + $D) & 0xffffffff;

        $a = dechex($aa);
        $b = dechex($bb);
        $c = dechex($cc);
        $d = dechex($dd);
}

function leftpad($needs_padding, $alignment)
{
    if (strlen($needs_padding) % $alignment) {
      $pad_amount    = $alignment - strlen($needs_padding) % $alignment;
      $left_pad      = implode('', array_fill(0, $pad_amount, '0'));
      $needs_padding = $left_pad . $needs_padding;
  }
  return $needs_padding;
}
Inshallah
Thank you so much man, this is an incredible answer. You've been a serious help.
blake
@blake - No problem. I should have been using the time to get some work done, but answering questions is more fun :-)
Inshallah
+1 for not doing your work.
Yehonatan
This doesn't product correct MD5 hashes for me
rocketeerbkw
@rocketeerbkw that's right, there is still a bug - I pointed that out in the answer.
Inshallah
@Inshallah I'm well versed in php but a complete algorithm (bit level manipulation) novice. Are you saying there is a bug other than the empty string one you mentioned?
rocketeerbkw
@rocketeerbkw I didn't notice any bugs besides the one I mentioned. Also, I said something about a string made up exclusively of zero characters. I think I meant, a string made up exclusively of '0' characters. Can't really remember, was a long time ago :-). Did you run into any problems with the corrected algorithm?
Inshallah
@Inshallah without editing your code, MD('test') does not produce a valid hash. I have noticed that different hashes are produced depending on the version of PHP (broken in 4.4.9, different in 5.2 and 5.3). I'll have to come back to this when I understand more.
rocketeerbkw