First off, this is NOT a duplicate of http://stackoverflow.com/questions/1911053/turn-a-c-string-with-null-bytes-into-a-char-array , because the answer given there doesn't work when the char * strings are Unicode.

I think the problem is that I am using UTF-8 encoded char * strings instead of ASCII ones, so each character can have a different length, and this no longer works:

char *Buffer;             // your null-separated strings
char *Current;            // Pointer to the current string
// [...]
for (Current = Buffer; *Current; Current += strlen(Current) + 1)
  printf("GetOpenFileName returned: %s\n", Current);

Does anyone have a similar solution that works on Unicode strings?

I have been banging my head on this for over 4 hours now. C doesn't agree with me.

EDIT: I think that the problem is that the char * is now UTF-8 instead of ASCII.

+2  A: 

Don't use char*. Use wchar_t* and the related functions:

wchar_t *Buffer;             // your null-separated strings
wchar_t *Current;            // Pointer to the current string
// [...]
for (Current = Buffer; *Current; Current += wcslen(Current) + 1)
  wprintf(L"GetOpenFileName returned: %ls\n", Current);

Incidentally, wchar_t is a 16-bit type on Windows, so each element has a fixed size, unlike the variable-length byte sequences of UTF-8. If your source data is UTF-8 encoded as char*, you should first convert it to wchar_t* to work with it.

Marcelo Cantos
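
For reference, here is a minimal sketch of the wide-character loop from the answer above, written with standard C only. The function name `print_wide_strings` and the sample file names are illustrative assumptions, not part of the original code. The `%ls` conversion is used because it denotes a wchar_t string in both standard C and the Microsoft runtime.

```c
#include <stddef.h>
#include <wchar.h>

/* Walk a list of wide strings that are separated by single NULs and
   terminated by an extra NUL (an empty string). Prints each entry and
   returns how many entries were visited. */
size_t print_wide_strings(const wchar_t *buffer)
{
    size_t count = 0;
    const wchar_t *current;

    /* wcslen counts wchar_t elements, not bytes, so adding 1 steps over
       exactly one separator regardless of sizeof(wchar_t). */
    for (current = buffer; *current != L'\0'; current += wcslen(current) + 1) {
        wprintf(L"GetOpenFileName returned: %ls\n", current);
        count++;
    }
    return count;
}
```

Called with a buffer such as `L"H:\\\0files.txt\0md5.php\0"`, this visits three entries and returns 3.
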
I tried that. I chose the file H:\files.txt and this is the output: http://pastebin.com/s9vJYiFp
Ramblingwood
How do I convert it from a UTF-8 char * to a wchar_t *?
Ramblingwood
If I leave everything the same (just using UTF-8 char*), I only get the first result (`H:\ `), not `H:\ ` and `files.txt` as I am expecting.
Ramblingwood
Your pastebin output happens because your data is already UTF-16, but you are holding it with a `char*` pointer, so it looks like `"H\0:\0\\\0f\0i\0l\0e\0s\0.\0t\0x\0t\0"`, i.e., lots of single-character strings. I'll need to see more surrounding code to understand what you're doing wrong.
Marcelo Cantos
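
The diagnosis in the comment above can be reproduced without Windows. The sketch below runs the char-based loop from the question over two hand-written byte sequences: one standing in for UTF-16LE data and one for UTF-8 data. The names `count_byte_strings`, `utf16le_sample` and `utf8_sample` are hypothetical, and the sample bytes are an assumption about what such a buffer could contain, not the actual pastebin data.

```c
#include <stddef.h>
#include <string.h>

/* Count entries the way the char-based loop does: advance by strlen + 1
   until an empty string is reached. */
size_t count_byte_strings(const char *buffer)
{
    size_t count = 0;
    const char *current;

    for (current = buffer; *current != '\0'; current += strlen(current) + 1)
        count++;
    return count;
}

/* "H:\" as UTF-16LE bytes followed by a 16-bit NUL: every ASCII character
   is followed by a zero byte, so strlen stops after one character. */
const char utf16le_sample[] = "H\0:\0\\\0\0";

/* "H:\" and "caf\xC3\xA9.txt" as UTF-8: multi-byte characters never
   contain a zero byte, so the char-based loop still finds both entries. */
const char utf8_sample[] = "H:\\\0caf\xC3\xA9.txt\0";
```

With these samples, `count_byte_strings(utf16le_sample)` sees three one-character strings (`H`, `:` and `\`), while `count_byte_strings(utf8_sample)` sees the two expected entries. This suggests that UTF-8 by itself does not break the char-based loop; reading UTF-16 data through a `char*` does.
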
If I try doing `for(Current = szFile...` I get something that supports your hypothesis: http://pastebin.com/E6aSeHwh So now what do I do?
Ramblingwood
Since that comment can't be edited: if I try doing `for(Current = szFile...` and select 2 files (H:\files.txt and H:\md5.php), I get: http://pastebin.com/E6aSeHwh So now what do I do?
Ramblingwood
I'm confused. Why are you trying to convert to UTF-8? Just use the LPCTSTR type directly with the TCHAR versions of functions (`_tcslen` and `_tprintf`).
Marcelo Cantos
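
A rough sketch of what the TCHAR-based loop suggested above might look like. On Windows, `<tchar.h>` maps `TCHAR`, `_T`, `_tcslen` and `_tprintf` to the narrow or wide variants depending on whether `_UNICODE` is defined. The `#else` branch is only a stand-in assumption so that the sketch also builds on other platforms, and the function name `print_selected_files` is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#include <tchar.h>            /* TCHAR, _T, _tcslen, _tprintf */
#else
/* Stand-ins for illustration only: narrow-character equivalents of the
   generic-text mappings, so the example compiles off Windows. */
typedef char TCHAR;
#define _T(x) x
#define _tcslen strlen
#define _tprintf printf
#endif

/* Print every entry of a NUL-separated, double-NUL-terminated TCHAR
   buffer and return the number of entries. */
size_t print_selected_files(const TCHAR *buffer)
{
    size_t count = 0;
    const TCHAR *current;

    for (current = buffer; *current != _T('\0'); current += _tcslen(current) + 1) {
        /* With the Microsoft runtime, %s in _tprintf matches the TCHAR
           width of the build. */
        _tprintf(_T("GetOpenFileName returned: %s\n"), current);
        count++;
    }
    return count;
}
```

Because the character type, the length function and the print function all switch together, the loop advances by whole characters in both narrow and wide builds.
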
I don't know what I was doing! Thanks for your help!
Ramblingwood