Good evening, folks.

This is my code:

static private function removeAccentedLetters($input) {
    for ($i = 0; $i < strlen($input); $i++) {
        $input[$i] = self::simplify($input[$i]);
    }
    return $input;
}
static private function simplify($in) {
    $ord = ord($in);
    switch ($ord) {
        case 193: // Á...
            return 'A';
        case 98: // b
            return 'a';
        default:
            return $in;
    }
}

OK. This is the bit that doesn't work:

case 193: // Á...
    return 'A';

And this is the bit that does:

case 98: // b
    return 'a';

These are just for testing purposes.

Could anyone tell me what's happening? I had the same sort of error before, but this time I'm not using any extended ASCII characters in the code itself, which was the cause last time.

I'm thinking it must have something to do with character encoding but I'm not sure. By the way, I'm coding in Eclipse and, according to it, the character encoding I'm using is Cp1252.

Oh, and yes, the code is supposed to eliminate any accented letters such as á and à and replace them with their basic vowels, i.e. á -> a.

Thanks in advance

+2  A: 

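Could it be that your input contains multibyte characters and you are looping through it one byte at a time, using strlen() as the loop bound? strlen() counts bytes, so it effectively assumes 1 byte == 1 character, and $input[$i] likewise returns a single byte. If the input is UTF-8, Á is stored as two bytes (0xC3 0x81), so ord() never returns 193 for it.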
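To make the byte-versus-character difference concrete, here is a small sketch. It assumes the mbstring extension is loaded, and it writes the test strings as explicit byte escapes so the result does not depend on how the source file is saved. The variable names are only illustrative.

```php
<?php
// "Á" encoded as UTF-8 is the two bytes 0xC3 0x81.
$utf8 = "\xC3\x81";
// The same letter in Cp1252 / ISO-8859-1 is the single byte 0xC1 (193).
$cp1252 = "\xC1";

echo strlen($utf8), "\n";              // 2: strlen() counts bytes
echo mb_strlen($utf8, 'UTF-8'), "\n";  // 1: mb_strlen() counts characters
echo ord($utf8[0]), "\n";              // 195: only the first byte, never 193
echo ord($cp1252[0]), "\n";            // 193: this is what the case label expects
```

So the `case 193` branch can only match when the input really is single-byte Cp1252 / ISO-8859-1 data.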

I'd look into existing transliteration libraries for PHP.

alex
That makes sense. Is there another way of cycling through a string which does not get fooled by this?
Felipe Almeida
@Felipe Look at `mb_strlen()`. However, I still reckon you should look at existing solutions. If you are really keen to roll your own, check out a known working one and dissect it. Have fun!
alex
*@Alex:* Your link to search Google doesn't work. Should be: http://www.google.com/search?q=php+transliteration
MikeSchinkel
@Mike Ah yep, I realised that! Edit coming...
alex
+1  A: 

Maybe one of these functions will help you, in combination with mb_strlen:

mb_strcut or mb_substr

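EDIT: For example, you could do something like this:

$string = 'cioèòà';
// Pass the encoding explicitly so the result does not depend on the configured default.
$length = mb_strlen($string, 'UTF-8');
for ($i = 0; $i < $length; $i++) {
  echo mb_substr($string, $i, 1, 'UTF-8');
}

This would echo each character of the string, one at a time.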
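Applied to the original goal, the same loop could look roughly like the sketch below. It is written as a standalone function rather than the static method from the question, it assumes UTF-8 input, PHP 7 or later and the mbstring extension, and the replacement map is deliberately tiny and purely illustrative, so extend it as needed.

```php
<?php
// Sketch only: replaces a handful of accented letters in a UTF-8 string.
function removeAccentedLetters(string $input): string
{
    // Keys are written as UTF-8 byte escapes so the file's own encoding does not matter.
    $map = [
        "\xC3\x81" => 'A', // Á
        "\xC3\xA1" => 'a', // á
        "\xC3\xA0" => 'a', // à
        "\xC3\xA8" => 'e', // è
        "\xC3\xB2" => 'o', // ò
    ];

    $out = '';
    $length = mb_strlen($input, 'UTF-8');
    for ($i = 0; $i < $length; $i++) {
        // mb_substr() returns one whole character, even if it spans several bytes.
        $char = mb_substr($input, $i, 1, 'UTF-8');
        $out .= $map[$char] ?? $char;
    }
    return $out;
}

echo removeAccentedLetters("cio\xC3\xA8\xC3\xB2\xC3\xA0"), "\n"; // cioeoa
```

With a map like this you could also skip the loop entirely and call strtr($input, $map), which does all the replacements in one pass.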

TheCandyMan666