NSString *theString = @"a %C3%B8 b";

NSLog(@"%@", theString);

NSString *utf8string = [theString stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];

NSLog(@"%@", utf8string);

const char *theChar = [utf8string UTF8String];

NSLog(@"%s", theChar);

This logs the following:

'a %C3%B8 b'

'a ø b'

'a √∏ b'

The problem is that I want theChar to be 'a ø b'. Any help on how to achieve that would be greatly appreciated.

A: 

From String Format Specifiers in String Programming Guide:

%s : Null-terminated array of 8-bit unsigned characters. %s interprets its input in the system encoding rather than, for example, UTF-8.

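So NSLog(@"%s", theChar) builds and displays an NSString by reading the bytes in the wrong encoding, while theChar itself contains the correct string data.

NSLog(@"%@", [NSString stringWithUTF8String:theChar]);

This gives the correct output (a ø b). Note the explicit @"%@" format: passing a non-literal string as the format argument would misbehave if the text ever contained a % character.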
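To see where the '√∏' comes from, here is a rough sketch in plain C (it also compiles inside an Objective-C file). It assumes the system encoding in play is Mac OS Roman, which the thread does not state but which matches the logged output: in that encoding every byte is one character, 0xC3 is √ and 0xB8 is ∏. The function name misread_as_macroman is made up for illustration, and it only knows the two non-ASCII bytes from this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes into out what each byte of s looks like when
   it is read as Mac OS Roman (ASSUMPTION: that is the "system encoding").
   The result is spelled in UTF-8. Only 0xC3 and 0xB8 are mapped; any other
   non-ASCII byte becomes '?'. Returns 0 on success, -1 if out is too small. */
int misread_as_macroman(const char *s, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0) return -1;

    for (; *s != '\0'; s++) {
        unsigned char b = (unsigned char)*s;
        char ascii[2] = { (char)b, '\0' };
        const char *glyph;
        size_t n;

        if (b < 0x80)       glyph = ascii;          /* ASCII is unchanged */
        else if (b == 0xC3) glyph = "\xE2\x88\x9A"; /* U+221A SQUARE ROOT */
        else if (b == 0xB8) glyph = "\xE2\x88\x8F"; /* U+220F N-ARY PRODUCT */
        else                glyph = "?";            /* not covered here */

        n = strlen(glyph);
        if (used + n + 1 > out_size) return -1;     /* keep room for NUL */
        memcpy(out + used, glyph, n);
        used += n;
    }
    out[used] = '\0';
    return 0;
}
```

Called with the question's bytes, misread_as_macroman("a \xC3\xB8 b", out, sizeof out) leaves 'a √∏ b' in out, which matches the third log line.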

Vladimir
A: 

I don't think you can hold the ø in a single char. char is an eight-bit type, so each value is between 0 and 255, and in UTF-8 the ø is not encoded as one such value: it takes two bytes (0xC3 0xB8).

You might want to look at the unichar type, which is a 16-bit type. It can hold the ø as one item, and you can use getCharacters:range: to get the characters out of the NSString.
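As a rough illustration of the difference between the two representations, the sketch below folds a two-byte UTF-8 sequence into a single 16-bit unit. It is plain C, uint16_t stands in for Foundation's 16-bit unichar (an assumption made so the example needs no Foundation headers), and decode_utf8_pair is a made-up name. It handles only two-byte sequences, not UTF-8 in general.

```c
#include <stdint.h>

/* Hypothetical helper: decodes one two-byte UTF-8 sequence
   (110xxxxx 10xxxxxx) into a single 16-bit code unit.
   Returns 0xFFFF if the bytes do not form a valid two-byte sequence. */
uint16_t decode_utf8_pair(unsigned char lead, unsigned char cont)
{
    /* Lead bytes 0xC0 and 0xC1 would be overlong encodings, so reject them. */
    if (lead < 0xC2 || (lead & 0xE0) != 0xC0) return 0xFFFF;
    if ((cont & 0xC0) != 0x80) return 0xFFFF;

    /* 5 payload bits from the lead byte, 6 from the continuation byte. */
    return (uint16_t)(((lead & 0x1F) << 6) | (cont & 0x3F));
}
```

For the ø in the question, decode_utf8_pair(0xC3, 0xB8) yields 0x00F8, the single 16-bit value that getCharacters:range: would hand back for that character.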

Mark
Thank you to both Vladimir and Mark. Your info clarified things.
Dagligleder
A: 

I'd like to add that your theChar does contain the UTF-8 byte sequence of your desired string. The problem lies with NSLog(@"%s"), which can't render those bytes correctly in the log file and/or the console.

So, if you want to pass the UTF8 byte sequence in char* to some other library, what you did is perfectly correct.
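If you want to convince yourself before handing the pointer to that library, one option is to dump the raw bytes instead of printing them as text. A minimal sketch in plain C follows; hex_dump is a made-up helper name, not part of any Apple API.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the bytes of a NUL-terminated string into out
   as space-separated uppercase hex pairs. Returns the number of bytes
   dumped, or -1 if out is too small. */
int hex_dump(const char *s, char *out, size_t out_size)
{
    size_t len = strlen(s);
    size_t needed = (len == 0) ? 1 : len * 3; /* "XX" + separator or NUL */
    char *p = out;
    size_t i;

    if (out_size < needed) return -1;

    for (i = 0; i < len; i++) {
        /* Go through unsigned char so bytes >= 0x80 do not sign-extend. */
        unsigned value = (unsigned char)s[i];
        p += sprintf(p, i ? " %02X" : "%02X", value);
    }
    *p = '\0';
    return (int)len;
}
```

For the string in the question, hex_dump("a \xC3\xB8 b", buf, sizeof buf) returns 6 and fills buf with "61 20 C3 B8 20 62": five characters, six bytes, with C3 B8 being the UTF-8 encoding of ø.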

Yuji