There are already questions about Unicode and ini files, but many of them are rather domain-specific, so I am not sure whether their answers apply to the general case.

Motivation: I want to use ini files for storing simple data like some numbers and some strings. The strings are provided by users (input via GUI). The software could run anywhere in the world, and the strings can be in any language. The files can also be shared between users (so they can be written on one system, read on another, and so on).

I figured that Unicode in ini files should be no problem when using GetPrivateProfileStringW and WritePrivateProfileStringW (I am targeting Windows XP and later).

But then I stumbled upon an answer in this question.

Quote:

The WritePrivateProfileStringW function will write the INI file in legacy system encoding (e.g. Shift-JIS on a Japanese system) because it is a legacy support function. If you want to have a fully Unicode-enabled INI file, you will need to use an external library.

I am unsure now - do I have to worry? Or can I just go ahead and use ini files?

Edit:

It seems the key to avoiding random encodings might be to prepare an empty file containing only a BOM and then use this file. Does anyone have experience (positive or negative) with this?

+1  A: 

The problem is not really with the use of ini files but with the functions you'll use to read from and write to those files.

As you noticed, WritePrivateProfileStringW() will not, by default, write Unicode data to a file it creates. Instead, it will use whatever multi-byte encoding (the ANSI code page) is standard on the system. That means non-ASCII strings in ini files created on a Japanese system won't be read back correctly on a Russian system. The reverse is also true.

If the files are not intended to be shared between systems with different encodings, you'll be fine. Otherwise, you may want to use a more Unicode-aware format instead of ini files, e.g. XML, whose encoding defaults to UTF-8 on all platforms.

Frédéric Hamidi
I found some more information about how the API seems to work: http://blogs.msdn.com/b/michkap/archive/2006/09/15/754992.aspx Quote: "So Jake, you have your simple fix now -- create a two byte file, containing the BOM, and then let WritePrivateProfileString handle the rest." In the comments Raymond Chen says: "Dean Harding is correct on all counts. It means "If the INI file already exists and appears to be Unicode text" and the way the code determines this is through our favorite dodgy API - IsTextUnicode. (The BOM serves as a big hint.)" Could the BOM be the key?
Heinrich Ulbricht
Yes, apparently if an `ini` file starts with a `BOM`, then all reads and writes to it will use `UNICODE` (UTF-16). You'll have to ensure all your files contain such a `BOM`, i.e. you'll have to add it yourself when creating a new file.
Frédéric Hamidi
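
For anyone who wants to try the BOM approach from the comments above, here is a minimal sketch of the "create a two-byte file containing the BOM" step. It uses only the C standard library, and the function names `has_utf16le_bom` and `ensure_bom_ini` are made up for this example; they are not part of any Windows API. It assumes the BOM in question is the UTF-16 little-endian one (bytes FF FE) and that the file path can be passed as a narrow string; for paths with non-ASCII characters on Windows you would probably want `_wfopen` instead of `fopen`.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if the file at `path` starts with the
   UTF-16LE BOM (bytes FF FE), 0 if it does not, -1 if it cannot be opened. */
int has_utf16le_bom(const char *path)
{
    unsigned char head[2];
    size_t n;
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    n = fread(head, 1, sizeof head, f);
    fclose(f);
    return (n == 2 && head[0] == 0xFF && head[1] == 0xFE) ? 1 : 0;
}

/* Hypothetical helper: if `path` does not exist yet, creates it as a
   two-byte file containing only the BOM. An existing file is left
   untouched. Returns 0 on success, -1 on an I/O failure. */
int ensure_bom_ini(const char *path)
{
    static const unsigned char bom[2] = { 0xFF, 0xFE };
    size_t written;
    FILE *f = fopen(path, "rb");
    if (f != NULL) {
        /* File already exists: do not overwrite its contents. */
        fclose(f);
        return 0;
    }
    f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    written = fwrite(bom, 1, sizeof bom, f);
    if (fclose(f) != 0 || written != sizeof bom)
        return -1;
    return 0;
}
```

The idea would be to call `ensure_bom_ini` once before the first `WritePrivateProfileStringW` call on a given file, passing the same full path to both, and to use `has_utf16le_bom` to spot ini files that were created earlier without a BOM. Note that an existing file without a BOM is deliberately left alone here; converting such a file would be a separate step.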