I've got a bunch of data that may contain mixed characters, special characters, accented characters, etc.
I've been using PHP's iconv with //TRANSLIT, but noticed today that a bullet point gets converted to 'bull'. I don't know which other characters like this are neither converted nor deleted. Characters such as $, *, and % do get removed.
Basically what I'm trying to do is keep letters, but remove just the 'non-language' bits.
This is the code I've been using:
$slugIt = @iconv('UTF-8', 'ASCII//TRANSLIT', $slugIt);
$slugIt = preg_replace("/[^a-zA-Z0-9 -]/", "", $slugIt);
Of course, if I move the preg_replace above the iconv call, the accented characters are removed before they can be transliterated, so that doesn't work either.
Any ideas on this? Or does anyone know which non-letter characters TRANSLIT misses?
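To illustrate the ordering problem, here is a minimal sketch of one possible workaround: filter first with a Unicode-aware pattern so accented letters survive, then transliterate. The function name keep_letters_then_translit is hypothetical, and what //TRANSLIT produces for a given character depends on the iconv implementation and the current locale, so treat this as a sketch rather than a definitive fix.

```php
<?php
// Hypothetical helper (name is an assumption, not from the original code).
function keep_letters_then_translit(string $text): string
{
    // \p{L} matches a letter in any script, \p{N} any digit.
    // The /u modifier makes PCRE treat pattern and subject as UTF-8,
    // so accented letters are kept while symbols and punctuation are dropped.
    // preg_replace returns null on invalid UTF-8, hence the fallback.
    $text = preg_replace('/[^\p{L}\p{N} -]/u', '', $text) ?? '';

    // Transliterate the remaining letters to ASCII.
    // Output varies by platform and locale; false means conversion failed.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);
    if ($ascii === false) {
        $ascii = $text;
    }

    // Final pass: drop anything still outside the allowed ASCII set,
    // including characters the transliteration itself may introduce.
    return preg_replace('/[^a-zA-Z0-9 -]/', '', $ascii);
}
```

With this order, symbols such as $, *, % and the bullet character are gone before iconv ever sees them, so only letters reach the transliteration step.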
---------------------Edited---------------------------------
Strangely, it doesn't appear to be TRANSLIT that changes the bullet to 'bull'. I commented out the preg_replace, and the 'bull' went back to being a bullet point. Unfortunately, I'm trying to use this to create readable URLs, as well as a few other things, so I would still need to do URL encoding.
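One possible explanation, offered only as a guess since the input isn't shown: if the data contains the HTML entity &bull; then the preg_replace strips the & and the ; and leaves the letters 'bull' behind. Under that assumption, decoding entities before filtering would avoid it. The sketch below uses a hypothetical function name, make_slug, and the same caveat applies that //TRANSLIT output depends on the platform and locale.

```php
<?php
// Hypothetical helper (name and exact slug rules are assumptions).
function make_slug(string $text): string
{
    // Turn entities such as &bull; or &amp; back into real characters first,
    // so their names are not left behind as stray letters.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Keep letters and digits from any script, plus spaces and hyphens.
    $text = preg_replace('/[^\p{L}\p{N} -]/u', '', $text) ?? '';

    // Transliterate to ASCII; fall back to the input if conversion fails.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);
    if ($ascii === false) {
        $ascii = $text;
    }

    // Remove anything still outside the allowed ASCII set.
    $ascii = preg_replace('/[^a-zA-Z0-9 -]/', '', $ascii);

    // Collapse runs of spaces and hyphens into a single hyphen.
    $slug = preg_replace('/[ -]+/', '-', trim($ascii));

    return strtolower(trim($slug, '-'));
}
```

Because the result contains only a-z, 0-9 and hyphens, it should be safe to place in a URL path without further encoding.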