ansaurus

Question

How do you convert character case in UNIX accurately (assuming i18n)?

Answer 1

+2 A:

Perl's lc/uc work fine for most languages, but they do not handle Turkish correctly; see this bug report of mine for details. If you don't need to worry about Turkish, Perl is good to go.
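To make the Turkish problem concrete, here is a minimal sketch in Python rather than Perl. Python's str.upper()/str.lower() apply the locale-independent Unicode default mappings, so they show the same kind of mismatch. The helpers turkish_upper and turkish_lower are hypothetical names made up for this illustration, not part of any library, and they only handle the precomposed dotted/dotless i characters.

```python
# Default (locale-independent) Unicode case mapping pairs i <-> I.
# Turkish instead pairs i <-> İ (U+0130) and ı (U+0131) <-> I.

def turkish_upper(text: str) -> str:
    """Hypothetical helper: uppercase using the Turkish i rules."""
    # Remap i -> İ and ı -> I first, then apply the default mapping.
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()


def turkish_lower(text: str) -> str:
    """Hypothetical helper: lowercase using the Turkish i rules."""
    # Remap İ -> i and I -> ı first, then apply the default mapping.
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


if __name__ == "__main__":
    print("istanbul".upper())         # default mapping: plain I, no dot
    print(turkish_upper("istanbul"))  # Turkish mapping: dotted capital İ
    print("ISPARTA".lower())          # default mapping: dotted i
    print(turkish_lower("ISPARTA"))   # Turkish mapping: dotless ı
```

Pre-replacing the four special characters keeps the sketch short, but it is only safe when you already know the text is Turkish (or Azerbaijani); applied to other languages it would corrupt every "i" and "I".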

cartman 2009-06-04 19:37:13

Well, the Turkish "i" is a common source of i18n/L10n-related problems.

Paweł Dyda 2010-09-12 18:45:31

Answer 2

+1 A:

You can't be sure that the text will be correct in every locale. That's not possible: software libraries always have some errors in their implementation of i18n-related stuff.

If you're not afraid of using C++ or Java, you may take a look at ICU, which implements a broad set of collation, normalization, and related rules.

Paweł Dyda 2010-09-12 18:53:55
