I'm using HTML entities on a string of rich HTML text, but some characters like  are still coming through. How can I either force English-only text while still preserving the HTML formatting, or force those characters to HTML entities?
A:
Try something like this:
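// Wrapper name is illustrative; the arrays come from the php.net comment.
function replace_accents($nom_archivo) {
    // Accented characters to search for
    $arr_busca = array('á','à','â','ã','ª','Á','À',
    'Â','Ã', 'é','è','ê','É','È','Ê','í','ì','î','Í',
    'Ì','Î','ò','ó','ô', 'õ','º','Ó','Ò','Ô','Õ','ú',
    'ù','û','Ú','Ù','Û','ç','Ç','Ñ','ñ');
    // Unaccented replacements, in the same order as $arr_busca
    $arr_susti = array('a','a','a','a','a','A','A',
    'A','A','e','e','e','E','E','E','i','i','i','I','I',
    'I','o','o','o','o','o','O','O','O','O','u','u','u',
    'U','U','U','c','C','N','n');
    // Swap each accented character for its plain equivalent, then trim whitespace
    $nom_archivo = trim(str_replace($arr_busca, $arr_susti, $nom_archivo));
    return $nom_archivo;
}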
I got it directly from the php.net str_replace page, which is why the variable names are in Spanish.
As mentioned in the comments, this list of characters is incomplete, but it shows the general idea: list the characters to look for and replace each with its plain equivalent. You might want to search for a library that does this more thoroughly.
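The same idea can be written with a single associative array passed to strtr(), which keeps each character next to its replacement so the two lists cannot drift out of step. This is only a sketch: `$accent_map` and `to_ascii()` are hypothetical names, the map is deliberately partial, and it assumes the script file and the input are both UTF-8.

```php
<?php
// Sketch only: $accent_map and to_ascii() are illustrative names, not from php.net.
// The map is partial on purpose; add the characters your own content uses.
$accent_map = [
    'á' => 'a', 'é' => 'e', 'í' => 'i', 'ó' => 'o', 'ú' => 'u',
    'ñ' => 'n', 'ç' => 'c',
    'å' => 'a', 'ä' => 'a', 'ö' => 'o', 'ü' => 'u',
    'Å' => 'A', 'Ä' => 'A', 'Ö' => 'O', 'Ü' => 'U',
    'ß' => 'ss', // a replacement may be longer than one character
];

function to_ascii(string $text, array $map): string
{
    // strtr() with an array replaces every key it finds and never
    // re-scans text it has already replaced, so HTML tags pass through untouched.
    return strtr($text, $map);
}

echo to_ascii('<em>Ångström café</em>', $accent_map), "\n";
```

Because the tags themselves are plain ASCII, the markup is preserved and only the mapped letters change. Any character missing from the map is left as it was.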
Chacha102
2009-07-25 23:02:29
That list is horribly incomplete, and I don't think it's safe to rely on. It doesn't include any of the characters used by languages in Northern Europe, such as å, ä, ö and ü.
Emil H
2009-07-25 23:30:12
Which is why I said I got it directly from the PHP page.
Chacha102
2009-07-25 23:47:47
Yes, it is very incomplete, but that is the technique to use to replace the letters with their English equivalents. I'm sure there is an actual library that will do it better.
Chacha102
2009-07-25 23:49:32
+3
A:
I think the following piece of code from phpbuilder seems reasonable. It checks some input ($string) for invalid characters using regex.
// Flag any character outside letters, digits, whitespace and common punctuation.
if (preg_match("@[^a-zA-Z0-9\s\~`\!\@#$%\^&\*\(\)_\-\+\=\{\}\[\]\'\"\:\;\?\/\>\<\.\,\|]@", $string)) {
    // There are non-English characters
} else {
    // There are no non-English characters
}
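If the goal is to convert the offending characters rather than just detect them, something along these lines might work. It is a sketch, not a tested library routine: `has_non_ascii()` and `non_ascii_to_entities()` are hypothetical names, and the input is assumed to be valid UTF-8. It detects anything outside the ASCII range and rewrites each such character as a numeric HTML entity, leaving the tags alone.

```php
<?php
// Sketch only: both function names are illustrative. Assumes UTF-8 input.

function has_non_ascii(string $text): bool
{
    // Without the /u flag the pattern works on bytes; any byte above 0x7F
    // belongs to a character outside plain ASCII.
    return preg_match('/[^\x00-\x7F]/', $text) === 1;
}

function non_ascii_to_entities(string $html): string
{
    // The /u flag makes the pattern match whole UTF-8 characters, not single bytes.
    $result = preg_replace_callback('/[^\x00-\x7F]/u', function (array $m): string {
        $bytes = array_values(unpack('C*', $m[0]));
        $count = count($bytes);
        // The lead byte carries 5, 4 or 3 payload bits for 2-, 3- or 4-byte sequences.
        $codepoint = $bytes[0] & (0xFF >> ($count + 1));
        // Each continuation byte contributes its low 6 bits.
        for ($i = 1; $i < $count; $i++) {
            $codepoint = ($codepoint << 6) | ($bytes[$i] & 0x3F);
        }
        return '&#' . $codepoint . ';';
    }, $html);

    // preg_replace_callback() returns null on invalid UTF-8; keep the input in that case.
    return $result ?? $html;
}

$html = "<b>caf\u{00E9}</b>";
if (has_non_ascii($html)) {
    echo non_ascii_to_entities($html), "\n"; // prints <b>caf&#233;</b>
}
```

Numeric entities cover every character, so no lookup table is needed. Where the mbstring extension is available, mb_encode_numericentity() should give an equivalent result with less code.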
Good luck.
anderstornvig
2009-07-25 23:17:51