I am working with a Wikipedia XML dump that is encoded in UTF-8. Right now I read everything into std::string, and when I print it to the screen with std::cout, non-ASCII characters are displayed as gibberish.
The parsing itself only looks for ASCII characters, but when I write the parsed file to disk I want to preserve the non-ASCII characters. In other words, I want the output to have the same encoding as the input.
Is it OK to use std::string, or am I going to have to use something like ICU? The libraries I have looked at seem overly complicated. Is there something quick I can use to do this?
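To make the question concrete, here is a minimal sketch of what I have in mind, assuming the whole input fits in memory and that treating the UTF-8 text as opaque bytes is acceptable. The helper names (`read_bytes`, `write_bytes`, `extract_between`, `self_check`) are hypothetical and only for illustration. It relies on the fact that every byte of a multi-byte UTF-8 sequence is 0x80 or higher, so a search for an ASCII delimiter cannot match in the middle of a character.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <string_view>

// Hypothetical helper: read a file as raw bytes (binary mode, no newline translation).
std::string read_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: write the bytes back unchanged, so UTF-8 in gives UTF-8 out.
bool write_bytes(const std::string& path, const std::string& data) {
    std::ofstream out(path, std::ios::binary);
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    return static_cast<bool>(out);
}

// Hypothetical helper: return the bytes between the first `open` and the next
// `close`, or an empty string if either ASCII delimiter is missing.
std::string extract_between(std::string_view text,
                            std::string_view open,
                            std::string_view close) {
    const std::size_t start = text.find(open);
    if (start == std::string_view::npos) return {};
    const std::size_t begin = start + open.size();
    const std::size_t end = text.find(close, begin);
    if (end == std::string_view::npos) return {};
    return std::string(text.substr(begin, end - begin));
}

// Usage example: "Zürich" with the ü written as its two UTF-8 bytes (0xC3 0xBC).
void self_check() {
    const std::string page = "<page><title>Z\xC3\xBC" "rich</title></page>";
    const std::string title = extract_between(page, "<title>", "</title>");
    assert(title == "Z\xC3\xBC" "rich");  // bytes are preserved untouched
    assert(title.size() == 7);            // size() counts bytes, not characters
}
```

Note that with this approach `size()` and indexing work on bytes rather than characters, which is fine for my ASCII-only parsing but would not be for anything that needs to count or split characters.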