views:

37

answers:

2

Hi, I'm trying to upgrade from PHP 5.2.x to 5.3.2 on my server. The problem is that I'm relying on the broken implementation of PHP's ezmlm_hash() (the bug is outlined here: http://bugs.php.net/bug.php?id=47969).

My first thought was to rewrite the broken version of the native PHP function (which is written in C) myself in PHP and use that in my code, instead of modifying the PHP source code and having to compile PHP from source.

Here is the C version of the code:

PHP_FUNCTION(ezmlm_hash)
{
    char *str = NULL;
    unsigned int h = 5381L;
    int j, str_len;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s",
                              &str, &str_len) == FAILURE) {
        return;
    }

    for (j = 0; j < str_len; j++) {
        h = (h + (h << 5)) ^ (unsigned long) (unsigned char) tolower(str[j]);
    }

    h = (h % 53);

    RETURN_LONG((int) h);
}

Here is what I've written in PHP:

function ezmlm_hash_mine($email_address){
    $h = 5381;
    $email_length = strlen($email_address);
    for($x=0;$x<$email_length;$x++){
        $chr = strtolower($email_address[$x]);
        $h = ($h + ($h << 5)) ^ ( ord($chr) );
    }

    $h = $h % 53;
    return $h;
}

I'm using a 64-bit machine. The two functions output different results:

$email_addresses = array(
    '[email protected]',
    '[email protected]',
);

print('<PRE>');

foreach($email_addresses as $email_address){
    print(ezmlm_hash($email_address).PHP_EOL);
    print(ezmlm_hash_mine($email_address).PHP_EOL.PHP_EOL);
}

output:

23
-52

15
-21

I know I probably have some precision or typing issues; I'm just not sure how to fix them. Any help would be greatly appreciated!

UPDATE

When I run the code on 32-bit machines, both functions output the new, corrected values:

12
12

45
45

I think this has something to do with the modulo operator... does anyone know the PHP equivalent of the C modulo operator? % in PHP behaves differently!
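For reference, PHP's % returns a result with the sign of the dividend, while the C code applies % to an unsigned value. The following is a minimal sketch of an unsigned modulo for a value held in a signed 64-bit PHP integer. The function name umod64 is hypothetical, and the sketch assumes a 64-bit PHP build and a small positive modulus. It only covers the modulo step; it does not help once the running hash has already been promoted to a float.

```php
<?php
// Hypothetical helper: treat the signed 64-bit integer $x as if it were
// an unsigned 64-bit value and return that value modulo $m.
// Assumes a 64-bit PHP build and a small positive $m (such as 53).
function umod64($x, $m)
{
    $r = $x % $m;                 // in PHP this takes the sign of $x
    if ($x >= 0) {
        return $r;                // non-negative values need no adjustment
    }
    // A negative $x stands for the unsigned value $x + 2^64.
    // 2^64 = 2 * PHP_INT_MAX + 2, which gives 2^64 mod $m without overflow.
    $two64 = ((PHP_INT_MAX % $m) * 2 + 2) % $m;
    return ((($r + $m) % $m) + $two64) % $m;
}
```

For example, umod64(-1, 53) stands for (2^64 - 1) % 53.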

UPDATE 2

It appears that this is not possible with vanilla PHP, as its floating-point arithmetic doesn't have enough precision, and weirdness in . I'll have to install either BCMath or GMP. Thanks for everyone's insight.

A: 

The problem is probably in this line: $h = ($h + ($h << 5)) ^ ( ord($chr) );

In C, before the XOR is applied, the single character is cast to a 4-byte long. I'm not sure what type ord() returns, though, so try breaking the expression into smaller expressions and testing whether they behave the same in C and PHP.

For example, h + (h << 5) should behave exactly the same in C and PHP for every h.
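One caveat: the two only agree while the intermediate value fits the type on both sides. The C code in the question declares h as unsigned int, so every step wraps at 32 bits, whereas a PHP integer keeps growing. Below is a minimal sketch that masks each step to 32 bits to mimic that wrap-around. The name ezmlm_hash_u32 is hypothetical, and the sketch assumes a 64-bit PHP build (where 0xFFFFFFFF is an integer) and a 32-bit C unsigned int.

```php
<?php
// Hypothetical name. Mimics the C loop with a 32-bit unsigned accumulator.
// Assumes a 64-bit PHP build, so no intermediate value overflows.
function ezmlm_hash_u32($str)
{
    $h = 5381;
    $str = strtolower($str);
    $len = strlen($str);

    for ($i = 0; $i < $len; $i++) {
        // $h stays below 2^32, so $h + ($h << 5) stays below 2^38;
        // the mask keeps only the low 32 bits, as unsigned int would.
        $h = (($h + ($h << 5)) & 0xFFFFFFFF) ^ ord($str[$i]);
    }

    return $h % 53;   // $h is never negative here, so % behaves as in C
}
```

Comparing this against the unmasked version, character by character, should show the first step at which the two diverge.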

Ivan
I thought that's what php's ord() did? What should I be doing otherwise?
Mike Sherov
OK, sorry, you're probably right. Another thing you might want to look into is that PHP does not support unsigned integers, and shift operations behave differently on signed and unsigned integers.
Ivan
It's not the shift; left shifts are the same for signed and unsigned values. It's likely the modulus.
Spudd86
@spudd86, I think that's more on the right track... is there any function that emulates the C version of modulus in PHP?
Mike Sherov
You could just build the C hash function as a PHP extension. Is that an option?
Ivan
@Ivan, it's an option, but I don't want to have to add that step to our build process when we deploy new machines or upgrade them. I'd rather just solve it in PHP. Thanks for the input though.
Mike Sherov
+1  A: 

Try this (EDIT: truncate to 32 bits after the calculation):

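function ezmlm_hash_mine($email_address){
    $h = gmp_init(5381);
    // gmp_setbit() modifies its first argument and returns nothing,
    // so build the moduli 2^64 and 2^32 with gmp_pow() instead.
    $d = gmp_pow(2, 64);
    $d32 = gmp_pow(2, 32);
    $email_length = strlen($email_address);

    $chr = strtolower($email_address);

    for($x=0;$x<$email_length;$x++){
        // h = ((h + h * 32) mod 2^64) xor chr, then truncated to 32 bits
        $h = gmp_mod(gmp_xor(gmp_mod(gmp_add($h, gmp_mod(gmp_mul($h, "32"), $d)), $d), ord($chr[$x])), $d32);
    }

    $h = gmp_mod($h, 53);
    return gmp_intval($h);
}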
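If GMP is not available, a 64-bit unsigned accumulator can also be approximated in plain PHP by keeping the value as two 32-bit halves, so that no intermediate result exceeds PHP_INT_MAX. This is only a sketch: the name ezmlm_hash_u64 is hypothetical, it assumes a 64-bit PHP build, and it assumes that the behaviour to reproduce is a hash computed with a 64-bit-wide h and no truncation to 32 bits.

```php
<?php
// Hypothetical name. Emulates the hash loop with a 64-bit unsigned
// accumulator stored as two 32-bit halves ($hi, $lo).
// Assumes a 64-bit PHP build.
function ezmlm_hash_u64($str)
{
    $mask = 0xFFFFFFFF;
    $hi = 0;
    $lo = 5381;
    $str = strtolower($str);
    $len = strlen($str);

    for ($i = 0; $i < $len; $i++) {
        // h + (h << 5) equals h * 33; compute it modulo 2^64 half by half.
        $lo33 = $lo * 33;                              // below 2^38
        $hi = ($hi * 33 + ($lo33 >> 32)) & $mask;      // carry into high half
        // The XOR with a byte only touches the low half.
        $lo = ($lo33 & $mask) ^ ord($str[$i]);
    }

    // (hi * 2^32 + lo) % 53, without ever building the 64-bit value.
    return (($hi % 53) * (4294967296 % 53) + ($lo % 53)) % 53;
}
```

For short inputs, where the value never exceeds 32 bits, this gives the same result as a 32-bit computation; the two only differ once the running value wraps.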
Spudd86
I do not have GMP installed, and while I'm sure this works, I'm trying to not have to add anything to my PHP installation other than the straight upgrade. +1 though.
Mike Sherov
Unfortunately, I just tried it, and it has a bug somewhere...
Spudd86