How do I convert text in Simplified Chinese GB 2312, or a similar multi-byte encoding, to UTF-8 using C++?

+3  A: 

On Unix systems, your best option is the iconv library.

See iconv_open, iconv, and iconv_close.

You do need to know the source character encoding, of course: GB 2312 text is usually stored as EUC-CN, and sometimes as HZ.

If you are not on a Unix system, look for conversion support in the OS. Doing character conversions by hand is very hard to get right.
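
For illustration, here is a minimal sketch of the three calls used together. The helper name to_utf8 is made up for this example and is not part of iconv. It assumes the glibc-style iconv() signature that takes a non-const char** for the input, and that your iconv build knows the encoding name you pass (names such as "GB2312", "GBK", and "EUC-CN" are common, but the supported set varies by platform).

```cpp
#include <iconv.h>

#include <cerrno>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of iconv): converts `input` from the
// encoding named `from` (for example "GB2312" or "EUC-CN") to UTF-8.
std::string to_utf8(const std::string& input, const char* from) {
    // Note the argument order: target encoding first, source second.
    iconv_t cd = iconv_open("UTF-8", from);
    if (cd == (iconv_t)-1) {
        throw std::runtime_error("iconv_open: unsupported conversion");
    }

    std::string output;
    char buffer[4096];
    // Assumption: glibc-style signature taking char**. iconv() does not
    // modify the input bytes, so the const_cast is safe here.
    char* in = const_cast<char*>(input.data());
    size_t in_left = input.size();

    while (in_left > 0) {
        char* out = buffer;
        size_t out_left = sizeof buffer;
        size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
        int err = errno;  // save before anything else can overwrite it
        // Keep whatever was converted in this round.
        output.append(buffer, sizeof buffer - out_left);
        // E2BIG only means the output buffer filled up: loop again.
        // Anything else (EILSEQ, EINVAL) is bad or truncated input.
        if (rc == (size_t)-1 && err != E2BIG) {
            iconv_close(cd);
            throw std::runtime_error("iconv: invalid or incomplete input");
        }
    }

    iconv_close(cd);
    return output;
}
```

With this sketch, to_utf8("\xD6\xD0\xCE\xC4", "GB2312") should yield the bytes E4 B8 AD E6 96 87, which is "中文" in UTF-8. Depending on the platform, you may need to link with -liconv.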

Pieter
That's helpful. Thanks
Chris Huang-Leaver
+2  A: 

On Windows, use the WinAPI functions MultiByteToWideChar and, for the opposite direction, WideCharToMultiByte. I can post a sample later.

However, UTF-8 is rather awkward to work with inside an application. MultiByteToWideChar converts a string from a given code page to UTF-16 (often loosely called UCS-2). I suggest you use UTF-16 internally in your software, and only convert to UTF-8, using WideCharToMultiByte with CP_UTF8, when your program needs to produce such output. So going from GB 2312 to UTF-8 takes two steps: source code page to UTF-16, then UTF-16 to UTF-8. This is the usual way of handling internationalization/Unicode on Windows and OS X.
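
A rough sketch of that two-step conversion follows. The function name gbk_to_utf8 is made up for this example. It assumes the input is in Windows code page 936 (GBK, a superset of GB 2312; code page 20936 is plain GB 2312) and that the input is smaller than INT_MAX bytes. The code is Windows-only, hence the _WIN32 guard.

```cpp
#ifdef _WIN32  // this sketch only builds on Windows
#include <windows.h>

#include <stdexcept>
#include <string>

// Hypothetical helper: converts code page 936 (GBK) text to UTF-8
// by going through UTF-16.
std::string gbk_to_utf8(const std::string& input) {
    if (input.empty()) return std::string();
    const UINT kGbk = 936;  // assumption: source text is GBK / GB 2312
    const int in_len = static_cast<int>(input.size());

    // Step 1: GBK -> UTF-16. The first call only asks for the length.
    int wide_len = MultiByteToWideChar(kGbk, MB_ERR_INVALID_CHARS,
                                       input.data(), in_len, nullptr, 0);
    if (wide_len == 0) {
        throw std::runtime_error("MultiByteToWideChar failed");
    }
    std::wstring wide(wide_len, L'\0');
    MultiByteToWideChar(kGbk, MB_ERR_INVALID_CHARS,
                        input.data(), in_len, &wide[0], wide_len);

    // Step 2: UTF-16 -> UTF-8. With CP_UTF8 the last two arguments
    // must be null.
    int utf8_len = WideCharToMultiByte(CP_UTF8, 0, wide.data(), wide_len,
                                       nullptr, 0, nullptr, nullptr);
    if (utf8_len == 0) {
        throw std::runtime_error("WideCharToMultiByte failed");
    }
    std::string utf8(utf8_len, '\0');
    WideCharToMultiByte(CP_UTF8, 0, wide.data(), wide_len,
                        &utf8[0], utf8_len, nullptr, nullptr);
    return utf8;
}
#endif
```

If the application keeps its strings in UTF-16 internally, as suggested above, only the first step is needed when reading input and only the second when writing output.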

psoul