Hello, I just want to write a few simple lines to a text file in C++, but I want them to be encoded in UTF-8. What is the easiest way to do so?
Thanks
libiconv is a great library for all our encoding and decoding needs.
If you are using Windows, you can use WideCharToMultiByte and pass CP_UTF8 as the code page to get UTF-8 output.
If by "simple" you mean ASCII, there is no need to do any encoding, since characters with an ASCII value of 127 or less are the same in UTF-8.
The only way UTF-8 affects std::string is that size(), length(), and all the indices are measured in bytes, not characters.
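To make the byte-versus-character point concrete, here is a small sketch. The variable names are just for illustration, and the é is written with hex escapes so the result does not depend on how the source file itself is encoded.

```cpp
#include <cstddef>
#include <string>

// "café" spelled out as UTF-8 bytes: 'c', 'a', 'f', then 0xC3 0xA9 for é.
// The literal is split in two so the hex escape cannot swallow a following letter.
const std::string cafe = "caf" "\xC3\xA9";

// size() and length() both count bytes: 3 ASCII bytes + 2 bytes for é = 5,
// even though a reader sees only 4 characters.
const std::size_t cafe_bytes = cafe.size();

// Indexing is by byte too: cafe[3] is the first byte of é, cafe[4] the second.
const unsigned char first_byte_of_e_acute = static_cast<unsigned char>(cafe[3]);
```

So cafe.size() is 5, and cafe[3] on its own is only half of the é.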
And, as sbi points out, incrementing the iterator provided by std::string will step forward by byte, not by character, so it can actually point into the middle of a multibyte UTF-8 sequence. There's no UTF-8-aware iterator provided in the standard library, but there are a few available on the 'Net.
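If you only need to step or count by code point, you may not need a full iterator library. The following is a minimal sketch, assuming the string already holds valid UTF-8; utf8_code_points and utf8_next are hypothetical helper names, not standard library functions, and neither one validates its input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a string assumed to be valid UTF-8.
// Continuation bytes look like 10xxxxxx; every other byte starts a new code point.
std::size_t utf8_code_points(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// Hypothetical helper: given the byte index of the start of a code point,
// returns the byte index where the next code point starts (or s.size()).
std::size_t utf8_next(const std::string& s, std::size_t i) {
    if (i >= s.size()) {
        return s.size();
    }
    ++i;  // move off the lead byte
    while (i < s.size() && (static_cast<unsigned char>(s[i]) & 0xC0) == 0x80) {
        ++i;  // skip continuation bytes
    }
    return i;
}
```

Note that this counts code points, which is still not always the same as what a user perceives as one character (combining marks, for example).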
If you remember that, you can put UTF-8 into std::string, write it to a file, etc., all in the usual way (by which I mean the way you'd use a std::string without UTF-8 inside).
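As a rough illustration of "the usual way", here is a sketch that writes a std::string to a file with std::ofstream. write_utf8_file is a hypothetical helper name, and opening the stream in binary mode is my own choice so that the bytes land in the file exactly as they are in the string; a plain text-mode stream also works if you don't mind newline translation on Windows.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes the bytes of `text` to `path` unchanged.
// The stream does no encoding conversion, so if `text` holds UTF-8,
// the file holds UTF-8. Returns false if the file could not be written.
bool write_utf8_file(const std::string& path, const std::string& text) {
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return false;
    }
    out << text;
    out.flush();  // surface write errors before reporting success
    return static_cast<bool>(out);
}
```

A call such as write_utf8_file("out.txt", "caf\xC3\xA9\n") would then produce a 6-byte file containing "café" and a newline.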
You may want to start your file with a byte order mark (the three bytes EF BB BF in UTF-8) so that other programs will know it is UTF-8.
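A sketch of that, assuming you do want the mark: the UTF-8 encoding of U+FEFF is the three bytes EF BB BF, written before any text. write_utf8_file_with_bom is a hypothetical helper name. Whether to add the mark at all depends on which programs will read the file, since some tools treat those bytes as ordinary content.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes the UTF-8 byte order mark, then `text` unchanged.
// Returns false if the file could not be written.
bool write_utf8_file_with_bom(const std::string& path, const std::string& text) {
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return false;
    }
    out.write("\xEF\xBB\xBF", 3);  // U+FEFF encoded as UTF-8
    out << text;
    out.flush();  // surface write errors before reporting success
    return static_cast<bool>(out);
}
```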