I need to compare strings and match names to one another even if they are not spelled the same way.
For example, DÉSIRÉ-Smith
should match Desireesmith
as well as Desiree or Desi'ree Smith
So I had the following approach, which worked perfectly on the command line using PHP-CLI:
<?php
class Alike {
    // Reduce a name to a canonical lowercase a-z form so that spelling variants compare equal
    public static function convertAlike($string) {
        // in case the first and last name or two first names are mixed up
        $parts = preg_split('/[\s\-\.\_]/', $string, -1, PREG_SPLIT_NO_EMPTY);
        sort($parts);
        $string = implode($parts);
        $string = iconv('UTF-8', 'ASCII//TRANSLIT', $string); // transliterate
        $string = strtolower($string); // lowercase
        $string = preg_replace('/[^a-z]/','',$string); // remove everything but a-z
        $string = preg_replace('{(.)\1+}','$1',$string); // remove duplicate chars
        return $string;
    }
    // Two names are alike when their canonical forms are identical
    public static function compareAlike($string1,$string2) {
        return strcmp(self::convertAlike($string1), self::convertAlike($string2)) === 0;
    }
}
echo Alike::convertAlike("DÉSIRÉ-Smith").PHP_EOL; // desiresmith
echo Alike::convertAlike("Desireesmith").PHP_EOL; // desiresmith
echo Alike::convertAlike("Desi'ree Smith").PHP_EOL; // desiresmith
echo Alike::convertAlike("René Röyßeå likes special characters ½ € in ©").PHP_EOL; // reneroysealikespecialcharacterseurinc
var_dump(Alike::compareAlike("DÉSIRÉ-Smith","Desireesmith")); // true
var_dump(Alike::compareAlike("Desireesmith","Desi'ree Smith")); // true
var_dump(Alike::compareAlike("summer","winter")); // false
?>
However, on my website, which runs Apache/2.2.14 (Ubuntu) with PHP 5.3.2-1ubuntu4.2 as a module, I always get just question marks.
The funny thing is that the error must occur in this line:
$string = iconv('UTF-8', 'ASCII//TRANSLIT', $string); // transliterate
because afterwards I can see every character that did not need to be transliterated, but those that should have been replaced by ASCII characters become question marks.
I tried every possible combination of input/output string encodings and iconv internal, input and output encoding settings, as well as locale settings. I even ran chmod -R 777 /usr/lib/gconv and moved it to my working directory.
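To make the comparison between the two environments concrete, here is a minimal diagnostic sketch that could be run once from PHP-CLI and once through Apache. The function name iconvEnvironmentReport is made up for this post; it only uses PHP_SAPI, setlocale(), the ICONV_IMPL and ICONV_VERSION constants and iconv() itself, and it assumes the iconv extension is loaded. The values it prints depend entirely on the machine, so no expected output is shown.

```php
<?php
// Hypothetical helper (not part of the class above): collects the settings
// that can differ between mod_php and PHP-CLI and affect iconv().
function iconvEnvironmentReport() {
    $sample = "D\xC3\x89SIR\xC3\x89"; // "DÉSIRÉ" written as raw UTF-8 bytes
    $result = iconv('UTF-8', 'ASCII//TRANSLIT', $sample);
    return array(
        'sapi'          => PHP_SAPI,                 // e.g. CLI vs. Apache handler
        'lc_ctype'      => setlocale(LC_CTYPE, '0'), // '0' only queries the current locale
        'iconv_impl'    => ICONV_IMPL,               // which iconv implementation PHP uses
        'iconv_version' => ICONV_VERSION,
        'sample_result' => $result,                  // false if the conversion failed
    );
}

// Print one "name: value" line per setting so the two runs can be diffed.
foreach (iconvEnvironmentReport() as $name => $value) {
    echo $name, ': ', var_export($value, true), PHP_EOL;
}
```

Any line that differs between the two runs is a candidate for explaining why the same call behaves differently.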
However, I saw this bug reported on the PHP bug tracker: http://bugs.php.net/bug.php?id=44096
[2010-06-07 21:22 UTC] icovt at yahoo dot com
mod_php iconv() is not working properly if your apache is chrooted and you do not
have the content of /usr/lib/gconv/ folder into your relative chroot path (i.e.
/your/chroot/path/usr/lib/gconv/).
You can simply do:
cp /usr/lib/gconv/* /your/chroot/path/usr/lib/gconv/
... and re-try.
This was a fix for me, hope this could save time for somebody else.
P.S. Btw, initially iconv() called from command line (using php cli) was OK.
I tried that. My www-data user has its home in /var/www/, so I ended up with the folder /var/www/usr/lib/gconv/ as well as /var/www/myproject/usr/lib/gconv/.
FYI: I had encoding detection and transcoding functions to ensure that the correct encodings were passed in, but I removed them for the sake of clarity. They are not needed anyway: if you input UTF-8 strings, everything should be fine...
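For completeness, this is roughly what such a check can look like. It is a simplified sketch, not the code I removed: the name ensureUtf8 and the ISO-8859-1 fallback are assumptions made for illustration. It validates the input with preg_match() and the /u modifier, which fails on invalid UTF-8, and otherwise converts from the assumed fallback encoding with iconv().

```php
<?php
// Hypothetical helper: returns $string unchanged if it is valid UTF-8,
// otherwise treats it as $fallback (assumed ISO-8859-1) and converts it.
function ensureUtf8($string, $fallback = 'ISO-8859-1') {
    // With the /u modifier, preg_match() returns false for invalid UTF-8 subjects.
    if (preg_match('//u', $string) === 1) {
        return $string;
    }
    return iconv($fallback, 'UTF-8', $string);
}

// Usage: normalise the encoding before handing the name to Alike::convertAlike().
$name = ensureUtf8("D\xC9SIR\xC9-Smith"); // Latin-1 bytes in, UTF-8 bytes out
echo bin2hex($name), PHP_EOL;
```

Since every byte sequence is valid ISO-8859-1, the fallback conversion always produces some UTF-8 output, but it is only correct if the input really was in that encoding.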
Any ideas?