If a file name contains a £ (pound) sign, directory_iterator correctly returns the UTF-8 byte sequence \xC2\xA3.

wdirectory_iterator uses wide chars, but it still returns the UTF-8 sequence, with each byte stored in its own wchar_t. Is this the correct behaviour for wdirectory_iterator, or am I using it incorrectly?

AddFile(testpath, "pound£sign");
wdirectory_iterator iter(testpath);
TS_ASSERT_EQUALS(iter->leaf(), L"pound\xC2\xA3sign"); // Succeeds
TS_ASSERT_EQUALS(iter->leaf(), L"pound£sign"); // Fails
+2  A: 

The encoding of wide chars (wchar_t objects) is implementation-dependent. For the second assertion (i.e. L"pound£sign") to succeed, you will probably need to change the underlying locale. The default is "C", which does not know about the pound character. The hex-escaped version succeeds because the escapes give the wchar_t values directly, so the glyph never has to be mapped to a value in a particular encoding.

Note: I am skipping the exact wording of the standard w.r.t. wchar_t, extended character sets, etc. for brevity.
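
To illustrate the difference between the two assertions, here is a minimal sketch in plain C++, with no Boost involved. Both helpers, widen_bytewise and decode_utf8_two_byte, are hypothetical names made up for this example; they are not part of any library. The first copies each byte into its own wchar_t, which is roughly what a conversion under the "C" locale tends to do (an assumption about the cause here, not a description of Boost's internals). The second decodes one- and two-byte UTF-8 sequences, which is enough to cover U+00A3.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: widen every byte to one wchar_t without any
// multibyte decoding. Assumed to approximate a "C"-locale conversion.
std::wstring widen_bytewise(const std::string& bytes) {
    std::wstring out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        out.push_back(static_cast<wchar_t>(static_cast<unsigned char>(bytes[i])));
    }
    return out;
}

// Hypothetical helper: decode one- and two-byte UTF-8 sequences
// (U+0000..U+07FF). Anything else is replaced with L'?'.
std::wstring decode_utf8_two_byte(const std::string& bytes) {
    std::wstring out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        if (lead < 0x80) {
            // Plain ASCII: the code point equals the byte value.
            out.push_back(static_cast<wchar_t>(lead));
        } else if ((lead & 0xE0) == 0xC0 && i + 1 < bytes.size()) {
            const unsigned char next = static_cast<unsigned char>(bytes[i + 1]);
            if ((next & 0xC0) == 0x80) {
                // Combine 5 payload bits of the lead byte with 6 of the next.
                out.push_back(static_cast<wchar_t>(((lead & 0x1F) << 6) | (next & 0x3F)));
                ++i;
            } else {
                out.push_back(L'?');
            }
        } else {
            out.push_back(L'?');
        }
    }
    return out;
}
```

Given the narrow name "pound\xC2\xA3sign", widen_bytewise yields eleven wide characters (the two bytes 0xC2 and 0xA3 stay separate), while decode_utf8_two_byte yields ten, with the single value 0xA3 in the sixth position.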

dirkgently
Are you referring to the compiler not reading the wide char _literals_ correctly? Can you tell the compiler what encoding it should use when parsing the source code?
xtofl
Q1) This is possible. Q2) Not in a standard/portable way except for locales.
dirkgently
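
On the source-encoding point raised in the comments: one way to avoid depending on how the compiler reads the source file is to spell the character as a universal character name instead of typing £ directly. A small sketch follows; expected_leaf is a hypothetical name used only for illustration, and the result assumes an implementation whose wchar_t holds Unicode code points, which is common but not guaranteed.

```cpp
#include <string>

// Hypothetical helper: builds the expected wide name with U+00A3 written as
// a universal character name, so the source file's encoding does not matter.
std::wstring expected_leaf() {
    return L"pound\u00A3sign";
}
```

This only fixes the literal on the test side. The iterator would still have to return a properly decoded wide string for the comparison to succeed.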