I know UTF-8, but what's the difference between *.utf8?

From the answer to my post

A: 

In which context? ja_JP tells us that the text is in the Japanese language, as used in Japan. That has nothing to do with the character encoding; depending on context, it is probably used for sorting, keyboard input, and the language of displayed text in the program.

Emil Vikström
A: 

At a guess, I'd say each utf8 file with that naming convention contains a language definition for translating your site.

adam
A: 
Locale = ja_JP 
Encoding = UTF-8
S.Mark
Where can I look up locales for different countries?
In the CLDR repository on unicode.org: http://www.unicode.org/repos/cldr/tags/release-1-7/common/main/
S.Mark
A: 

Before Unicode, non-English characters were handled with workarounds such as code pages and special character sets like Shift_JIS. UTF-8, as an encoding of Unicode, covers a much larger range of characters with a completely different mapping system (i.e. the way each character is addressed by number).

When setting ja_JP.UTF8 as the locale, the "UTF8" part specifies the character encoding used for text. For example, when you output a currency amount in the Japanese locale, you will need the ¥ character. The encoding determines which bytes are used to represent the ¥.

I'm assuming a ja_JP.Shift_JIS locale could exist. One difference from the UTF8 one, among others, would be that the ¥ sign is output in the form that this specific encoding requires.
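To make the encoding point concrete, here is a minimal Python sketch showing that the same character turns into different bytes depending on the encoding. It only uses Python's built-in codecs and does not touch the locale system; the helper name `encode_all` and the sample characters are my own choices for illustration.

```python
def encode_all(text, encodings):
    """Return a dict mapping each encoding name to the encoded bytes of text."""
    return {enc: text.encode(enc) for enc in encodings}

# The yen sign is code point U+00A5 and needs two bytes in UTF-8.
yen_utf8 = "¥".encode("utf-8")

# A Japanese character encoded three different ways.
kana = encode_all("あ", ["utf-8", "shift_jis", "euc_jp"])

print("¥ in utf-8:", yen_utf8.hex(" "))
for enc, raw in kana.items():
    print(f"あ in {enc}: {raw.hex(' ')}")
```

The character is the same in every case; only the byte representation changes, which is what the part after the dot in the locale name selects.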

Why ja_JP?

The two codes in ja_JP signify language (I think based on ISO 639) and country (based on ISO 3166). This is important if a language is spoken in more than one country. In the German-speaking area, for example, the Swiss format numbers differently than the Germans: 1'000'000 vs. 1.000.000. The country code serves to define these distinctions within the same language.
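As a rough sketch of how the parts of such a name fit together, the Python below splits a locale name into language, country and encoding, then picks a digit grouping separator by language and country. `split_locale`, `group_thousands` and the `GROUP_SEP` table are hypothetical names made up for this example, and the table holds only the two separators mentioned above, not real locale data.

```python
def split_locale(name):
    """Split 'ja_JP.UTF8' into (language, country, encoding); missing parts are ''."""
    base, _, encoding = name.partition(".")
    language, _, country = base.partition("_")
    return language, country, encoding

# Hand-written table for illustration; real programs read this from locale data.
GROUP_SEP = {("de", "DE"): ".", ("de", "CH"): "'"}

def group_thousands(number, locale_name):
    """Format an integer using the grouping separator for the given locale."""
    language, country, _ = split_locale(locale_name)
    sep = GROUP_SEP[(language, country)]
    # Format with commas first, then swap in the locale's separator.
    return f"{number:,}".replace(",", sep)

print(split_locale("ja_JP.UTF8"))
print(group_thousands(1000000, "de_DE.UTF8"))
print(group_thousands(1000000, "de_CH.UTF8"))
```

Note that the encoding part plays no role in choosing the separator: language and country decide the formatting rules, while the encoding only decides how the resulting text is stored as bytes.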

Pekka