For example, both , and , are commas, but the first one takes 2 bytes, while the second takes only 1.
How can I convert the 2-byte one to the 1-byte one?
You can use iconv with the translit option (e.g., ASCII//TRANSLIT or ISO-8859-1//TRANSLIT, depending on how you serve your content). I haven't tried this out, but I believe it will work.
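As a rough sketch of that idea, the snippet below assumes the input is already UTF-8 and that the wide comma is the fullwidth comma U+FF0C (3 bytes in UTF-8, 2 bytes in encodings such as GBK). The helper name to_ascii_punctuation is hypothetical. Because the way //TRANSLIT approximates a given character depends on the iconv implementation installed, the sketch replaces the fullwidth comma explicitly first and only then hands the rest to iconv.

```php
<?php
// Hypothetical helper (not from the answer above): assumes $utf8 is valid UTF-8.
function to_ascii_punctuation(string $utf8): string
{
    // Replace the fullwidth comma (U+FF0C) explicitly, so this step does not
    // depend on how the local iconv implementation transliterates it.
    $utf8 = strtr($utf8, ["\u{FF0C}" => ',']);

    // Ask iconv to approximate any remaining non-ASCII characters.
    // The @ suppresses the notice iconv raises on invalid input.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $utf8);

    // iconv() returns false on failure; fall back to the partially converted text.
    return $ascii === false ? $utf8 : $ascii;
}

echo to_ascii_punctuation("a\u{FF0C}b"), "\n";
```

If the target is ISO-8859-1 rather than ASCII, the same call shape should apply with 'ISO-8859-1//TRANSLIT' as the second argument.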
If you're not sure what the incoming charset will be, you probably want to use mb_detect_encoding to detect it first, because iconv will fail if it encounters a byte sequence that isn't valid in the input charset you told it to expect.
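One way to combine the two functions might look like the sketch below. The helper name normalize_to_utf8 and the candidate list are assumptions for illustration; the list should be replaced with the encodings you actually expect. It requires the mbstring and iconv extensions.

```php
<?php
// Hypothetical helper: detect the charset, then convert to UTF-8 with iconv.
// The default candidate list is an assumption; adjust it to your real inputs.
function normalize_to_utf8(string $raw, array $candidates = ['UTF-8', 'ISO-8859-1']): ?string
{
    // Strict mode (third argument true) rejects any candidate encoding
    // in which $raw contains an invalid byte sequence.
    $detected = mb_detect_encoding($raw, $candidates, true);
    if ($detected === false) {
        return null; // no candidate matched; let the caller decide what to do
    }

    // Convert from the detected charset; iconv() returns false on failure.
    $utf8 = iconv($detected, 'UTF-8', $raw);
    return $utf8 === false ? null : $utf8;
}

// "\xE9" is "é" in ISO-8859-1 and is not valid UTF-8 on its own.
var_dump(normalize_to_utf8("\xE9") === "\u{E9}");
```

Detection is a heuristic, so when the sender declares a charset (for example in an HTTP header), that declaration is probably more trustworthy than the guess.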
You may wish to read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets for a painless intro to the subject, if you're unfamiliar with charsets.