ansaurus

Question

Reading and outputting UTF-8 strings in C/Cocoa

Answer 1

+1 A:

UTF-8 is a multi-byte encoding of Unicode (see Wikipedia), which means some characters, such as the accented ones you've run into, take more than one byte. C's char type is a single byte, so C's notion of a "character" doesn't match Unicode's.
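To make the byte/character mismatch concrete, here is a minimal sketch in plain C. The helper name utf8_code_points is made up for illustration, and it assumes the input is already valid UTF-8 (it does no validation). It relies on the fact that UTF-8 continuation bytes always look like 10xxxxxx, so any byte that doesn't match that pattern starts a new code point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count Unicode code points in a NUL-terminated
   string that is assumed to be valid UTF-8 (no validation is done).
   Continuation bytes have the bit pattern 10xxxxxx, so every byte
   that does NOT match that pattern begins a new code point. */
size_t utf8_code_points(const char *s)
{
    assert(s != NULL);

    size_t count = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For the string "caf\xC3\xA9" (the word "café" with the é written out as its two UTF-8 bytes), strlen reports 5 while utf8_code_points reports 4. Note that a code point is still not quite the same as what a user sees as one character (combining accents are separate code points), so treat this as an illustration only.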

If you want to read Unicode with the standard C runtime library, you'll also need to use a Unicode conversion library, such as libiconv.
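As a rough sketch of what the iconv route might look like, the following decodes a UTF-8 string into an array of code points. The helper name utf8_to_code_points is hypothetical, the target encoding name "UCS-4LE" is an assumption that holds for the common iconv implementations but may differ on yours, and error handling is kept to the bare minimum. On some systems (Mac OS X among them) you may need to link with -liconv.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode a NUL-terminated UTF-8 string into code
   points using iconv. Returns the number of code points written to
   out, or (size_t)-1 on any error (bad input, buffer too small, or
   the conversion not being available). */
size_t utf8_to_code_points(const char *utf8, uint32_t *out, size_t out_cap)
{
    assert(utf8 != NULL && out != NULL);

    /* iconv_open takes the target encoding first, then the source. */
    iconv_t cd = iconv_open("UCS-4LE", "UTF-8");
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *in = (char *)utf8;            /* iconv's prototype wants char ** */
    size_t in_left = strlen(utf8);
    char *dst = (char *)out;            /* raw output bytes land in out */
    size_t dst_left = out_cap * sizeof out[0];

    size_t rc = iconv(cd, &in, &in_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;

    size_t n = out_cap - dst_left / sizeof out[0];

    /* The output is little-endian regardless of the host, so rebuild
       each value from its bytes instead of assuming host byte order. */
    for (size_t i = 0; i < n; ++i) {
        const unsigned char *b = (const unsigned char *)&out[i];
        uint32_t v = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
                     ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
        out[i] = v;
    }
    return n;
}
```

Given "caf\xC3\xA9" and room for 8 entries, this should fill in four code points, the last one being 0xE9 (é). Once the text is in this form, each array element is one code point, which is closer to what Unicode means by "character" than a single char is.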

(Using wchar_t may also work; I've never researched it.)

Or you can use NSString, which already supports Unicode.

Dewayne Christensen 2010-01-22 14:50:39

Thanks, I think I'm getting that part. The bit I don't get now is why I have a valid `char` type string after the cocoa call, but not before. What is it doing to the string to make it suddenly valid?

Ben 2010-01-22 21:51:39
