I know utf8,but what's the difference between *.utf8?
In which context? ja_JP tells us that the string is in the Japanese language. That does not have anything to do with the character encoding, but is probably used - depending on context - for sorting, keyboard input and language on displayed text in the program.
At a guess, I'd say each utf8 file with that naming convention contains a language definition for translating your site.
Before Unicode, handling non-english characters was done using tricks like Code Pages (like this) and special character sets (like this: Shift_JIS). UTF-8 contains a much larger range of characters with a completely different mapping system (i.e. the way each character is addressed by number).
When setting ja_JP.UTF8
as the locale, the "UTF8" part signifies the encoding for the special characters needed. For example, when you output a currency amount in the Japanese locale, you will need the ¥ character. The encoding information defines which character set to use to display the ¥.
I'm assuming there could exist a ja_JP.Shift_JIS
locale. One difference to the UTF8 one - among others - would be that the ¥ sign is displayed in a way that works in this specific encoding.
Why ja_JP?
The two codes ja_JP
signify language (I think based on this ISO norm) and country (based on this one). This is important if a language is spoken in more than one country. In the german speaking area, for example, the Swiss format numbers differently than the germans: 1'000'000
vs. 1.000.000
. The country code serves to define these distinctions within the same language.