Hello all,

I have a string that may or may not contain Unicode characters, and I am trying to write it to a file on Windows. Below is a sample of my code. My problem is that when I fopen the file and read the values back out, they are all interpreted as UTF-16 characters.

const char* x = "Fool";
FILE* outFile = fopen( "Serialize.pef", "w+,ccs=UTF-8");
fwrite(x,strlen(x),1,outFile);
fclose(outFile);

char buffer[12] = {0};
outFile = fopen( "Serialize.pef", "r,ccs=UTF-8");
fread(buffer,1,11,outFile);
fclose(outFile);

The characters are also interpreted as UTF-16 if I open the file in WordPad or a similar editor. What am I doing wrong?

+2  A: 

Yes, when you specify that the text file should be encoded in UTF-8, the CRT implicitly assumes that you'll be writing Unicode text to the file. Writing anything else wouldn't make sense, because you wouldn't need UTF-8 for it. This will work properly:

wchar_t* x = L"Fool";
FILE* outFile = fopen( "Serialize.txt", "w+,ccs=UTF-8");
fwrite(x, wcslen(x) * sizeof(wchar_t), 1, outFile);
fclose(outFile);

Or:

char* x = "Fool";
FILE* outFile = fopen( "Serialize.txt", "w+,ccs=UTF-8");
fwprintf(outFile, L"%hs", x);
fclose(outFile);
Hans Passant
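If the char string already holds UTF-8 encoded bytes, another option is to skip the CRT's translation entirely and write the bytes as they are. The following is only a sketch under that assumption: the helper names write_utf8_bytes and read_utf8_bytes are made up for illustration, and the file is opened in binary mode with no ccs= flag, so nothing gets reinterpreted as wide characters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the bytes of `utf8` unchanged.
 * Assumes `utf8` is already valid UTF-8 (plain ASCII qualifies).
 * Returns 0 on success, -1 on failure. */
static int write_utf8_bytes(const char *path, const char *utf8)
{
    assert(path != NULL && utf8 != NULL);

    FILE *f = fopen(path, "wb");        /* binary mode, no ccs= flag */
    if (f == NULL)
        return -1;

    size_t len = strlen(utf8);
    size_t written = fwrite(utf8, 1, len, f);

    if (fclose(f) != 0 || written != len)
        return -1;
    return 0;
}

/* Hypothetical helper: reads at most cap - 1 bytes and always
 * NUL-terminates `buf`. Returns the byte count, or -1 on failure. */
static long read_utf8_bytes(const char *path, char *buf, size_t cap)
{
    assert(path != NULL && buf != NULL && cap > 0);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    size_t n = fread(buf, 1, cap - 1, f);
    buf[n] = '\0';
    fclose(f);
    return (long)n;
}
```

Calling write_utf8_bytes("Serialize.pef", "Fool") and then read_utf8_bytes("Serialize.pef", buffer, sizeof buffer) should give back the same bytes. Note that this writes no byte order mark, so an editor has to detect the encoding on its own.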
Of course you'll be writing Unicode text to the file, but the point is that the CRT assumes you'll be writing **UTF-16**.
dan04
@dan - no, it assumes you'll be writing wchar_t. That wchar_t holds UTF-16 on Windows is an implementation detail.
Hans Passant
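To make the distinction in these comments concrete: whatever width wchar_t has, turning wide text into UTF-8 comes down to encoding each Unicode code point as one to four bytes. Below is a rough sketch of that step in portable C. It is not the CRT's actual code, the function name utf8_encode is made up for illustration, and it takes a full code point, so UTF-16 surrogate pairs would have to be combined before calling it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encodes one Unicode code point as UTF-8.
 * Writes 1 to 4 bytes into `out` and returns how many were written,
 * or 0 if `cp` is a surrogate or lies beyond U+10FFFF. */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {                    /* ASCII: one byte, unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                   /* two bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                 /* three bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                   /* lone surrogate, not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {               /* four bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                           /* outside the Unicode range */
}
```

Every character of "Fool" is ASCII, so each one encodes to a single byte that equals its char value. That is why a correctly written UTF-8 file containing "Fool" is byte-for-byte the same as the plain char string.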