Short answer:
No conversion is required if you use Unicode strings such as CString or wstring: use sqlite3_open16(). You just have to make sure you pass a WCHAR pointer, cast to void *, to the API. For a CString that looks like: (void*)(LPCWSTR)strFilename. (Seems lame! Even though this lib is cross-platform, they could have defined a wide-char type that depends on the platform and is less unfriendly than a void *.)
The longer answer:
You don't have a Unicode string that you want to convert to UTF-8 or UTF-16; you have a Unicode string represented in your program using a given encoding. Unicode is not a binary representation per se: encodings specify how the Unicode code points (numerical values) are laid out in memory (the binary layout of the number). UTF-8 and UTF-16 are the most widely used encodings, but they are very different.
When a VS project says "Unicode character set", it actually means "characters are encoded as UTF-16". Therefore, you can use sqlite3_open16() directly; no conversion required. Characters are stored in the WCHAR type (as opposed to char), which takes 16 bits. WCHAR falls back on the standard C type wchar_t, which takes 16 bits on Win32 but might be a different size on other platforms. (Thanks for the correction, Checkers.)
There's one more detail that you might want to pay attention to: UTF-16 exists in two flavors, big-endian and little-endian. That's the byte ordering of those 16-bit units. The function prototype you give for UTF-16 doesn't say which ordering is used, but you're safe here: SQLite expects UTF-16 in the machine's native byte order, and on Windows (x86) that is little-endian.
EDIT: Answer to comment by Checkers:
UTF-16 uses 16-bit code units; under Win32, wchar_t is used for such a storage unit. The trick is that some Unicode characters require a sequence of two such 16-bit units. They are called surrogate pairs.
In the same way, UTF-8 represents one character using a sequence of 1 to 4 bytes, yet UTF-8 strings use the char type.