I am using the ICU library in C++ on OS X. All of my strings are UnicodeStrings, but I need to use system calls like fopen, fread and so forth. These functions take const char* or char* as arguments. I have read that OS X supports UTF-8 internally, so all I should need to do is convert my UnicodeString to UTF-8, but I don't know how to do that.

UnicodeString has a toUTF8() member function, but it writes to a ByteSink rather than returning a string. I've also found these examples: http://source.icu-project.org/repos/icu/icu/trunk/source/samples/ucnv/convsamp.cpp and read about using a converter, but I'm still confused. Any help would be much appreciated. Thank you!

+1  A: 

Call UnicodeString::extract(...) to extract into a char*. Pass NULL for the converter to get the default converter, which uses the charset your OS is using.

Steven R. Loomis
Thank you! That does work. I'm not sure about the destCapacity argument and the length of the UnicodeString. This code works: http://codepad.org/blaSP0ex but you'll notice I double the .length() of the UnicodeString manually to make up for the multibyte string. How can I make sure there is enough space in my char* dest?
Isaac
http://icu-project.org/apiref/icu4c/classUnicodeString.html#125255f27efd817e38806d76d9567345 It will return the length needed for the output string and a U_BUFFER_OVERFLOW_ERROR in status if there wasn't enough space. See http://userguide.icu-project.org/strings#TOC-Using-C-Strings:-NUL-Terminated-vs%2e
Steven R. Loomis
Thank you. The documentation says that it's best to guess the size and if there's a buffer overflow error, then to call the extract function again with the length returned from the first call. I do that here: http://codepad.org/nyp5yJWB but the second call still fails, even though I provide it with the correct length returned from the first extract call. What am I doing wrong?
Isaac
I used delete where I should have used delete[], and I don't think I need sizeof (I usually use C), but those are minor details.
Isaac
That's right, but you need to reset the error code after the failure. ICU functions just exit if the error is already set. http://userguide.icu-project.org/design#TOC-Error-Handling
Steven R. Loomis
Thank you! Everything works now. I don't mean to keep pestering you, but it just seems like you're the only one who knows anything about ICU.
Isaac