For the Char data-type, how do I specify that I want to use the Turkish i instead of the English i for the toLower and toUpper functions?
The locale has no impact on the default `Data.Char` library.
grddev
2010-08-05 08:40:26
+12
A:
The `Data.Char` library in Haskell is not locale dependent. It works for all Unicode characters, but perhaps not in the way you would expect. In the corresponding Unicode chart you can see the mappings for "dotted"/"dotless" i's.
toUpper 'i' => 'I'
toUpper 'ı' => 'I'
toLower 'I' => 'i'
toLower 'İ' => 'i'
Thus, neither of the two transforms is reversible: both 'i' and 'ı' uppercase to 'I', and both 'I' and 'İ' lowercase to 'i'. If you want reversible handling of Turkish characters, it seems you have to either use a C library or roll your own.
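To make the non-reversibility concrete, here is a small check using only `Data.Char`. The helper names `roundTripsViaUpper` and `roundTripsViaLower` are made up for this illustration, and the Turkish letters are written as escapes (`'\x0131'` is ı, `'\x0130'` is İ) so the snippet does not depend on source-file encoding.

```haskell
import Data.Char (toLower, toUpper)

-- Hypothetical helper: True when upper-casing and then lower-casing
-- returns the original character.
roundTripsViaUpper :: Char -> Bool
roundTripsViaUpper c = toLower (toUpper c) == c

-- Hypothetical helper: True when lower-casing and then upper-casing
-- returns the original character.
roundTripsViaLower :: Char -> Bool
roundTripsViaLower c = toUpper (toLower c) == c

-- Characters to try in GHCi:
--   'i'       plain ASCII i
--   '\x0131'  dotless i: upper-cases to 'I', which lower-cases to 'i'
--   '\x0130'  dotted capital I: lower-cases to 'i', which upper-cases to 'I'
samples :: [Char]
samples = ['i', '\x0131', '\x0130']
```

With the mappings listed above, `roundTripsViaUpper 'i'` holds, while `roundTripsViaUpper '\x0131'` and `roundTripsViaLower '\x0130'` should both be `False`.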
UPDATE: The Haskell 98 report makes this quite clear, whereas the Haskell 2010 report only says that `Char` corresponds to a Unicode character, and does not as clearly define the semantics of `toLower` and `toUpper`.
grddev
2010-08-05 08:39:52
@Alexandre: I documented how Haskell works, and what the (linked) Unicode specification says. If you want other behavior, you need to implement your own (as in jrockway's reply).
grddev
2010-08-05 17:22:21
+7
A:
A Simple Matter Of Programming:
import qualified Data.Char as Char
toLower 'I' = 'ı'
toLower x = Char.toLower x
Then
(toLower <$> "I AM LOWERCASE") == "ı am lowercase"
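The same idea can be extended to cover both directions and all four characters, which is what makes the mapping reversible. This is only a sketch: `toLowerTr` and `toUpperTr` are hypothetical names, not part of any library, and the Turkish letters are again written as escapes (`'\x0131'` is ı, `'\x0130'` is İ). Everything else falls through to `Data.Char`.

```haskell
import qualified Data.Char as Char

-- Turkish-style lower-casing (hypothetical name):
-- I -> dotless i, dotted capital I -> i, everything else as in Data.Char.
toLowerTr :: Char -> Char
toLowerTr 'I'      = '\x0131'
toLowerTr '\x0130' = 'i'
toLowerTr c        = Char.toLower c

-- Turkish-style upper-casing (hypothetical name):
-- i -> dotted capital I, dotless i -> I, everything else as in Data.Char.
toUpperTr :: Char -> Char
toUpperTr 'i'      = '\x0130'
toUpperTr '\x0131' = 'I'
toUpperTr c        = Char.toUpper c
```

With these definitions, `toUpperTr (toLowerTr c)` gives back `c` for each of 'I' and 'İ', and `toLowerTr (toUpperTr c)` gives back `c` for each of 'i' and 'ı'. It still does nothing for code in other libraries that calls `Char.toLower` directly.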
jrockway
2010-08-05 15:07:47
Are you really telling me that I have to hack every library that calls Char.toLower in order to support internationalization?
Jonathan Allen
2010-08-05 18:48:12
@Jonathan: Yes, because the Haskell specification only says to follow the Unicode standard, which provides the rules I gave above. Thus any library that uses `Char.toLower` is not prepared for internationalization.
grddev
2010-08-05 19:04:59
@Jonathan Allen: If you don't want the standard Unicode behavior, then no, you can't use libraries that follow the Unicode standard. It's unfortunate, but pretty plainly so.
Chuck
2010-08-05 23:47:41
I should clarify that this is not the best possible solution. It would be good to write a library that is more flexible than Data.Char, and the community would surely appreciate any contributions in that area.
jrockway
2010-08-06 01:00:06