We have a set of applications that were developed for the ASCII character set. Now we're trying to install them in Iceland, and we are running into problems where the Icelandic characters get corrupted.
We are working through our issues, but I was wondering: is there a good "guide" out there for writing C++ code that is designed for 8-bit characters but will still work properly when it is given UTF-8 data?
I can't expect everyone to read the whole Unicode standard, but if there is something more digestible available, I'd like to share it with the team so we don't run into these issues again.
Rewriting all the applications to use wchar_t or some other string representation is not feasible at this time. I'll also note that these applications communicate over networks with servers and devices that use 8-bit characters, so even if we used Unicode internally, we'd still have translation issues at the boundaries. For the most part, these applications just pass data around; they don't "process" the text in any way other than copying it from place to place.
The operating systems used are Windows and Linux. We use std::string and plain-old C strings. (And don't ask me to defend any of the design decisions. I'm just trying to help fix the mess.)
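To make the pass-through case concrete, here is a minimal sketch of where byte-oriented code tends to be safe with UTF-8 and where it is not. The helper names utf8_code_points and utf8_truncate are hypothetical and are not taken from our applications, and the sketch assumes the input is already well-formed UTF-8. Copying a std::string preserves every byte, so plain copies are fine; the risk is in code that treats one byte as one character, for example when counting characters or cutting a string to fit a fixed-size field.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: count code points in a well-formed UTF-8 string
// by counting every byte that is NOT a continuation byte (10xxxxxx).
std::size_t utf8_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// Hypothetical helper: cut a string to at most max_bytes bytes without
// splitting a multi-byte sequence. Assumes well-formed UTF-8 input.
std::string utf8_truncate(const std::string& s, std::size_t max_bytes) {
    if (s.size() <= max_bytes) {
        return s;
    }
    std::size_t end = max_bytes;
    // s[end] is the first byte that would be dropped. If it is a
    // continuation byte, the cut falls inside a sequence, so back up
    // until the cut sits in front of a lead byte or an ASCII byte.
    while (end > 0 && (static_cast<unsigned char>(s[end]) & 0xC0) == 0x80) {
        --end;
    }
    return s.substr(0, end);
}
```

For example, the name "Þórður" occupies 9 bytes in UTF-8 but is 6 code points, so std::string::size() reports 9. A naive substr(0, 3) would keep only the first byte of "ó" and produce invalid UTF-8, whereas the sketch above backs up to a 2-byte result.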
Here is a list of what has been suggested: