directory_iterator returns UTF8 using both Visual Studio and Xcode as expected.

wdirectory_iterator, however, returns UTF16 using Visual Studio, and UTF8 using Xcode, despite returning a wchar_t string.

What can I change to get wdirectory_iterator to return UTF32?

An answer to a question I asked previously suggests that changing the locale might be required. However, according to 'locale -a', the only locales available are:

en_GB, en_GB.ISO8859-1, en_GB.ISO8859-15, en_GB.US-ASCII, en_GB.UTF-8

All are 8 bit, with the possible exception of en_GB.

I tried en_GB in case it might not be 8 bit, but this causes boost::filesystem::exists to throw a boost::filesystem::wpath::to_external conversion exception.

+1  A: 

wdirectory_iterator is a typedef for basic_directory_iterator<wpath>. wpath is a typedef for basic_path<std::wstring, wpath_traits>.

Similarly to what is done in std::basic_filebuf, a basic_path uses an "internal" encoding to represent names to the program, and an "external" encoding to interact with the platform's filesystem. As in std::basic_filebuf, conversion between these encodings is done by the std::codecvt facet of the locale imbued on it.

So, names are obtained by the iterator from the operating system in the system's encoding (that's the "external" encoding), and converted to the "internal" encoding with Traits::to_internal. To perform the desired conversion, you can thus:

  • Call wpath_traits::imbue() "early" in your program, passing it a locale with a codecvt facet performing UTF8->UTF32 conversion
  • Or define and use your own Traits class, where you implement to_internal to perform a UTF8->UTF32 conversion
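
For the second option, the core of a custom to_internal is the UTF8 decoding itself. Below is a minimal sketch of that step only, assuming the names coming from the system really are UTF8. utf8_to_utf32 is a hypothetical helper name, not part of Boost, and it checks only the structure of each sequence (overlong forms and surrogate code points are not rejected).

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a Boost API): decode UTF-8 bytes into UTF-32
// code points. Throws std::runtime_error on an invalid lead byte, a
// truncated sequence, or a bad continuation byte.
std::u32string utf8_to_utf32(const std::string& in)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        // i < in.size() holds here, so the subtraction cannot underflow.
        if (extra > in.size() - i - 1)
            throw std::runtime_error("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);  // append the 6 payload bits
        }

        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

Where wchar_t is 32 bits wide (as with Xcode), the resulting code points can be copied one for one into a std::wstring. With Visual Studio wchar_t is 16 bits wide, so a std::wstring cannot hold code points above U+FFFF as single elements.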
Éric Malenfant