views:

173

answers:

1

It has been mentioned in several sources that C++0x will include better language-level support for Unicode(including types and literals).

If the language is going to add these new features, it's only natural to assume that the standard library will as well. However, I am currently unable to find any references to the new standard library. I expected to find out the answer for these answers:

  1. Does the new library provide standard methods to convert UTF-8 to UTF-16, etc.?
  2. Does the new library allowing writing UTF-8 to files, to the console (or from files, from the console). If so, can we use cout or will we need something else?
  3. Does the new library include "basic" functionality such as: discovering the byte count and length of a UTF-8 string, converting to upper-case/lower-case(does this consider the influence of locales?)

Finally, are any of these functions are available in any popular compilers such as GCC or Visual Studio?

I have tried to look for information, but I can't seem to find anything. I am actually starting to think that maybe these things aren't even decided yet(I am aware that C++0x is a work in progress).

+9  A: 

Does the new library provide standard methods to convert UTF-8 to UTF-16, etc.?
No. The new library does provide std::codecvt facets which do the conversion for you when dealing with iostream, however. ISO/IEC TR 19769:2004, the C Unicode Technical Report, is included almost verbatim in the new standard.

Does the new library allowing writing UTF-8 to files, to the console (or from files, from the console). If so, can we use cout or will we need something else?
Yes, you'd just imbue cout with the correct codecvt facet. Note however that the console is not required to display those characters correctly

Does the new library include "basic" functionality such as: discovering the byte count and length of a UTF-8 string, converting to upper-case/lower-case(does this consider the influence of locales?)
AFAIK that functionality exists with the existing C++03 standard. std::toupper and std::towupper of course function just as in previous versions of the standard. There aren't any new functions which specifically operate on unicode for this.

If you need these kinds of things, you're still going to have to rely on an external library -- the <iostream> is the primary piece that was retrofitted.

What, specifically, is added for unicode in the new standard?

  • Unicode literals, via u8"", u"", and U""
  • std::char_traits classes for UTF-8, UTF-16, and UTF-32
  • mbrtoc16, c16rtomb, mbrtoc32, and c32rtomb from ISO/IEC TR 19769:2004
  • std::codecvt facets for the locale library
Billy ONeal