I'm looking for a portable and easy-to-use string library for C/C++ that helps me work with Unicode input/output. Ideally, it would store its strings in memory as UTF-8 and let me convert strings from ASCII to UTF-8/UTF-16 and back. I don't need much beyond that (OK, a liberal license wouldn't hurt). I have seen that C++ comes with a <locale> header, but this seems to work on wchar_t only, which may or may not be UTF-16 encoded, and I'm not sure how good it actually is.
Use cases are, for example: on Windows, the Unicode APIs expect UTF-16 strings, so I need to convert ASCII or UTF-8 strings before passing them to the API. The same goes for XML parsing, where input may arrive as UTF-16 but I only want to process UTF-8 internally (or, for that matter, if I switch to UTF-16 internally, I'll need a conversion to that anyway).
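To make the requirement concrete, here is a minimal sketch of the UTF-8 to UTF-16 conversion I have in mind. The function name utf8_to_utf16 is hypothetical and not taken from any library, and I'm assuming a C++11 compiler for std::u16string. It is an illustration of the operation, not a vetted implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from any library): decodes a UTF-8 byte string
// and re-encodes it as UTF-16 code units. Throws on malformed input.
std::u16string utf8_to_utf16(const std::string& in) {
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes;
    // used to reject overlong encodings.
    static const std::uint32_t min_cp[] = {0x0, 0x80, 0x800, 0x10000};

    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        // Reject overlong forms, surrogate code points and values > U+10FFFF.
        if (cp < min_cp[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point in UTF-8 input");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Windows, where wchar_t is 16 bits wide, the resulting code units could be copied into a std::wstring before calling a wide-character API. I would still prefer a maintained library over hand-rolled code like this.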
So far, I've taken a look at ICU, which is quite huge. Moreover, it wants to be built using its own project files, while I'd prefer a library that either has a CMake project or is easy to build (something like: compile all these .c files, link, and good to go), instead of shipping something as large as ICU along with my application.
Do you know of such a library that is also actively maintained? After all, this seems to be a pretty basic problem.