views: 187

answers: 2
I'm writing a small app in which I read some text from the console, which is then stored in a classic char* string.
As it happens, I need to pass it to a library that only takes UTF-8 encoded strings. Since the Windows console uses the local encoding, I need to convert from the local encoding to UTF-8.
If I'm not mistaken, I could use MultiByteToWideChar(..) to encode to UTF-16 and then use WideCharToMultiByte(..) to convert to UTF-8.

However, I wonder if there is a way to convert directly from the local encoding to UTF-8 without using any external libraries, since the idea of converting to wchar just to be able to convert back to char (UTF-8 encoded, but still) seems kind of weird to me.

+1  A: 

The POSIX world loves the iconv library for just this. It converts to and from virtually every encoding around, using char*.
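As an illustrative sketch (error handling abbreviated; the Latin-1 source encoding is just an example, not something from the question), an iconv conversion to UTF-8 stays in char* the whole way:

```c
#include <string.h>
#include <iconv.h>

/* Sketch: convert an ISO-8859-1 (Latin-1) string to UTF-8 with iconv.
 * Returns 0 on success, -1 on failure. */
int latin1_to_utf8(const char *in, char *out, size_t outsize)
{
    iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");
    if (cd == (iconv_t)-1)
        return -1;

    char *inbuf = (char *)in;        /* iconv wants non-const pointers */
    size_t inleft = strlen(in);
    char *outbuf = out;
    size_t outleft = outsize - 1;    /* leave room for the terminator */

    size_t rc = iconv(cd, &inbuf, &inleft, &outbuf, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;

    *outbuf = '\0';
    return 0;
}
```

Calling `latin1_to_utf8("caf\xE9", buf, sizeof buf)` should leave "café" in buf, with the 0xE9 re-encoded as the two UTF-8 bytes 0xC3 0xA9.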

moritz
+2  A: 

Converting from UTF-16 to UTF-8 is a purely mechanical process, but converting from a local encoding to UTF-16 or UTF-8 involves some large, specialized lookup tables. The C runtime just turns around and calls WideCharToMultiByte and MultiByteToWideChar for the non-trivial cases.
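To show what "purely mechanical" means here, the UTF-16 to UTF-8 step can be sketched with nothing but bit shifting, no tables at all (a sketch that assumes well-formed input; validation is abbreviated):

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch: UTF-16 code units -> UTF-8 bytes. Handles BMP characters and
 * surrogate pairs; returns the number of bytes written to out. */
size_t utf16_to_utf8(const uint16_t *in, size_t inlen, unsigned char *out)
{
    size_t o = 0;
    for (size_t i = 0; i < inlen; i++) {
        uint32_t cp = in[i];
        /* Combine a high/low surrogate pair into one code point */
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < inlen)
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[++i] - 0xDC00);

        if (cp < 0x80) {                        /* 1 byte: ASCII */
            out[o++] = (unsigned char)cp;
        } else if (cp < 0x800) {                /* 2 bytes */
            out[o++] = 0xC0 | (cp >> 6);
            out[o++] = 0x80 | (cp & 0x3F);
        } else if (cp < 0x10000) {              /* 3 bytes */
            out[o++] = 0xE0 | (cp >> 12);
            out[o++] = 0x80 | ((cp >> 6) & 0x3F);
            out[o++] = 0x80 | (cp & 0x3F);
        } else {                                /* 4 bytes */
            out[o++] = 0xF0 | (cp >> 18);
            out[o++] = 0x80 | ((cp >> 12) & 0x3F);
            out[o++] = 0x80 | ((cp >> 6) & 0x3F);
            out[o++] = 0x80 | (cp & 0x3F);
        }
    }
    return o;
}
```

The local-encoding direction has no such formula: which byte maps to which code point is defined per code page, which is why the lookup tables (and the OS calls that own them) are unavoidable there.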

As for having to use UTF-16 as an intermediate stage, as far as I know, there isn't any way around that - sorry.

Since you are already linking to an external library to get file input, you might as well link to the same library to get WideCharToMultiByte and MultiByteToWideChar.

Using the C runtime will make your code recompilable for other operating systems (in theory), but it also adds a layer of overhead between you and the library that does all of the real work in this case - kernel32.dll.

John Knoeller
It would just be a convenience to be able to do this directly. It does make a difference, in my opinion, if I have to allocate memory for the UTF-16 string and do error checking on two function calls instead of just calling one function and checking it for errors. I guess that's the price you have to pay when trying to stay Unicode compliant :)
Andreas Klebinger