Hi All, My application has to write data to an XML file which will be read by a swf file. The swf expects the data in the XML to be in UTF-8 encoding. I have to convert some multibyte characters in my app (Simplified Chinese, Japanese, Korean, etc.) to UTF-8. Are there any API calls that would allow me to do this? I would prefer not to use any third-party DLLs. I need to do it on both Windows and Mac, and would prefer system APIs if available.

Thanks jbsp72

A: 

"Multibyte" is just an arbitrary non-UTF-8 encoding. Use something like libiconv to convert from that encoding to UTF-8.

I am just a bit surprised that specifying the encoding in the XML file isn't working though...
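The libiconv suggestion above could look roughly like the following. This is a minimal sketch, not a definitive implementation: `convert_to_utf8` is a hypothetical helper name, the source charset (for example "GB2312") is assumed to be known by the caller, and the output buffer size is a generous guess rather than an exact bound. The `iconv` interface is available on Mac and most Unix-like systems; on Windows it would mean shipping libiconv.

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Convert a NUL-terminated string encoded as `from_charset` to UTF-8.
 * Returns a malloc'ed NUL-terminated string, or NULL on failure.
 * convert_to_utf8 is a hypothetical helper name, not part of iconv. */
char *convert_to_utf8(const char *src, const char *from_charset)
{
    /* iconv_open takes the target charset first, then the source. */
    iconv_t cd = iconv_open("UTF-8", from_charset);
    if (cd == (iconv_t)-1)
        return NULL;                    /* charset name not supported */

    size_t in_left = strlen(src);
    /* Assumption: 4 output bytes per input byte is enough for common
     * legacy encodings. If it is not, iconv fails and we return NULL. */
    size_t out_size = in_left * 4 + 1;
    char *out = malloc(out_size);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *in_ptr = (char *)src;         /* iconv wants char **, input is not modified */
    char *out_ptr = out;
    size_t out_left = out_size - 1;     /* keep one byte for the terminator */

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1) {             /* invalid input or buffer too small */
        free(out);
        return NULL;
    }

    *out_ptr = '\0';
    return out;
}
```

A caller would pass the raw bytes and the charset name, for example `convert_to_utf8(buffer, "GB2312")`, write the result to the XML file, and `free` it afterwards.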

Ignacio Vazquez-Abrams
I tried setting the encoding to gb2312 for Chinese and writing the multibyte characters as they are. This didn't work and I have no clue why. I would prefer using Windows system APIs. Aren't there any that would do the job for me?
jbsp72
A: 

I have to convert some Multibyte characters in my app(Chinese simplified, Japanese, Korean etc..) to UTF-8.

If your original string is in a multibyte encoding (Chinese, Arabic, Thai, etc.) and you need to convert it to another multibyte encoding (UTF-8), one way is to convert it to wide characters (UTF-16) first, then convert that to the target multibyte encoding.

multibyte(chinese/arabic/thai/etc) -> widechar(UTF-16) -> multibyte(UTF-8)

If your original string is already in Unicode (UTF-16), you can skip the first conversion in the illustration above.

You can look up the code page identifier for your source encoding on MSDN.
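On Windows, the two steps above map onto the system calls `MultiByteToWideChar` and `WideCharToMultiByte`. The following is a sketch under stated assumptions: `codepage_to_utf8` is a hypothetical helper name, and the caller is assumed to know the source code page (936 for Simplified Chinese, 932 for Japanese Shift-JIS, 949 for Korean). Each API is called twice, once to ask for the required size and once to convert.

```c
#ifdef _WIN32   /* Windows-only; the guard just lets the file compile elsewhere */
#include <windows.h>
#include <stdlib.h>

/* Convert a NUL-terminated string in Windows code page `codepage`
 * to UTF-8. Returns a malloc'ed string, or NULL on failure.
 * codepage_to_utf8 is a hypothetical helper name. */
char *codepage_to_utf8(const char *src, UINT codepage)
{
    /* Step 1: multibyte -> UTF-16. A length of -1 means "NUL-terminated",
     * and the returned count then includes the terminator. */
    int wlen = MultiByteToWideChar(codepage, 0, src, -1, NULL, 0);
    if (wlen == 0)
        return NULL;
    wchar_t *wide = malloc((size_t)wlen * sizeof(wchar_t));
    if (wide == NULL)
        return NULL;
    if (MultiByteToWideChar(codepage, 0, src, -1, wide, wlen) == 0) {
        free(wide);
        return NULL;
    }

    /* Step 2: UTF-16 -> UTF-8. The last two arguments must be NULL
     * when the target code page is CP_UTF8. */
    int ulen = WideCharToMultiByte(CP_UTF8, 0, wide, -1, NULL, 0, NULL, NULL);
    if (ulen == 0) {
        free(wide);
        return NULL;
    }
    char *utf8 = malloc((size_t)ulen);
    if (utf8 != NULL &&
        WideCharToMultiByte(CP_UTF8, 0, wide, -1, utf8, ulen, NULL, NULL) == 0) {
        free(utf8);
        utf8 = NULL;
    }

    free(wide);
    return utf8;
}
#endif
```

If the string is already UTF-16, only step 2 is needed. The returned buffer should be released with `free` once it has been written to the XML file.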

Google Chrome has some string conversion implementations for Windows, Linux, and Mac. You can see it here or here. The files are under src/base:
+ sys_string_conversions.h
+ sys_string_conversions_linux.cc
+ sys_string_conversions_win.cc
+ sys_string_conversions_mac.mm

The code is under a BSD license, so you can use it in commercial projects.

afriza
+1  A: 

UTF-8 is itself a multibyte encoding (well, a variable byte-length encoding, to be precise). Stating that you need to convert from a multibyte encoding is not enough. You need to specify which multibyte encoding your source uses.

troelskn
Also, please specify the environment/language you need this for.
Bandi-T