Possible Duplicate:
UTF-8 -> ASCII in C language

How do I convert a UTF-8 string to an ASCII string?

+4  A: 

UTF-8 is a superset of ASCII. The character codes 0-127 (i.e. the ASCII characters) are encoded directly as the single byte values 0-127. If you want to convert UTF-8 to ASCII, you can simply remove all bytes that are >= 128. This means that non-ASCII characters will be dropped from the converted string, if that is what you want.

Mind that for UTF-8 decoding, you need to detect characters that are encoded as multiple bytes. The first byte of such a character tells you the total number of bytes: it is the number of '1' bits left of the leftmost '0' bit. This only applies to lead bytes, which are >= 192 (11000000); the bytes that follow (continuation bytes) always have the form 10xxxxxx. For example, 11000011 is the first byte of a character that was encoded to two bytes (it has two leading '1' bits). That means you also have to remove the following byte.
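To make the lead-byte rule concrete, here is a minimal sketch in C. The helper name `utf8_seq_len` is made up for illustration, and it only classifies the first byte; it does not validate the continuation bytes or reject overlong encodings.

```c
#include <stddef.h>

/* Hypothetical helper: returns the total length in bytes of the UTF-8
   sequence introduced by the byte b, or 0 if b cannot start a sequence
   (a continuation byte 10xxxxxx, or the invalid pattern 11111xxx). */
static size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80)           return 1;  /* 0xxxxxxx: plain ASCII */
    if ((b & 0xE0) == 0xC0) return 2;  /* 110xxxxx: two leading 1 bits */
    if ((b & 0xF0) == 0xE0) return 3;  /* 1110xxxx: three leading 1 bits */
    if ((b & 0xF8) == 0xF0) return 4;  /* 11110xxx: four leading 1 bits */
    return 0;
}
```

A decoder would call this on the first byte of each character and then consume that many bytes in total.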

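As all the bytes that belong to a multi-byte-encoded character (lead and continuation bytes alike) are always >= 128, removing every byte >= 128 already removes such characters whole, so you can just forget about the multi-byte detection described above :)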
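The byte-dropping approach could look roughly like this in C. This is only a sketch: the function name `utf8_strip_to_ascii` is hypothetical, the input is assumed to be NUL-terminated, and the caller is assumed to provide an output buffer at least as large as the input including its terminator.

```c
#include <stddef.h>

/* Hypothetical helper: copies src to dst, skipping every byte >= 0x80.
   dst must have room for strlen(src) + 1 bytes.
   Returns the number of bytes written, not counting the terminating NUL. */
static size_t utf8_strip_to_ascii(const char *src, char *dst)
{
    size_t n = 0;
    for (; *src != '\0'; ++src) {
        /* Cast first: plain char may be signed, making bytes >= 0x80 negative. */
        if ((unsigned char)*src < 0x80)
            dst[n++] = *src;
    }
    dst[n] = '\0';
    return n;
}
```

For example, the input "caf\xC3\xA9 ok" (the last letter of "caf" plus an e with acute accent encoded as two bytes) comes out as "caf ok": the accented character disappears, it is not turned into a plain 'e'.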

AndiDog
+1 Nice. I like your approach LOL
pmg