Hi all. Look at this; I am trying:

appendFile "out" $ show 'д'

'д' is a character from the Russian alphabet. After that, the "out" file contains:

'\1076'

As I understand it, this is the Unicode code point of the character 'д'. Why does this happen? And how can I get the normal representation of my character?

For additional information, this works fine:

appendFile "out"  "д"

Thanks.

A: 

A quick web search for "UTF Haskell" should give you good links. Probably the most recommended package is the text package.

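import qualified Data.Text.IO as UTF
import qualified Data.Text as T

-- Pack the String into a Text and append it using the text package's IO.
main :: IO ()
main = UTF.appendFile "out" (T.pack "д")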
TomMD
+4  A: 

show escapes all characters outside the ASCII range (and some inside the ASCII range), so don't use show.

Since "д" works fine, just use that. If you can't because the д is actually inside a variable, you can use [c] (where c is the variable containing the character). If you need to surround it with single quotes (like show does), you can use ['\'', c, '\''].
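
A minimal sketch of this suggestion. The helper names `charToString` and `quoted` are hypothetical, chosen only for illustration; the point is that building the String directly skips the escaping that show performs.

```haskell
-- Hypothetical helpers (not from any library): build the String by hand
-- instead of going through show, so no escaping takes place.

-- | Turn a single character into a one-character String.
charToString :: Char -> String
charToString c = [c]

-- | Wrap the character in single quotes, as show would, but unescaped.
quoted :: Char -> String
quoted c = ['\'', c, '\'']
```

With these, `appendFile "out" (quoted 'д')` should write the quoted character itself, whereas `show 'д'` yields the escaped form seen in the question.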

sepp2k
I think `show` is highly over-used by many Haskell programmers. It's not suitable for pretty-printing because it's meant to be used for serialization (e.g. `read . show` should equal `id`), but performance is too poor for most serialization applications. It's handy for testing and prototyping, but beyond that I'd think twice about using `show`.
John
I want to use show for debugging. show converts a data structure to a string. For example, I have a [(String,String)] and I wish to see it. Of course the best way would be to output it to the console, but that is not possible, because I am using a file.
Anton
I'd agree that debugging is one of the most common good uses for show. It just gets tricky for situations like yours because of escaping characters outside ASCII (and escaping newline, which is particularly annoying to me).
John
+2  A: 

Use Data.Text. It provides IO with locale-awareness and encoding support.

Don Stewart
Data.Text is great, but the built-in IO system also provides locale-awareness and encoding support (since GHC 6.12).
Simon Marlow
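
A minimal sketch of the built-in route mentioned in the comment above, assuming you want to force UTF-8 rather than rely on the locale. The name `appendUtf8` is hypothetical; `withFile`, `hSetEncoding`, `utf8` and `hPutStr` come from System.IO in base.

```haskell
import System.IO

-- | Hypothetical helper: append a String to a file, encoding it as UTF-8
-- regardless of the current locale.
appendUtf8 :: FilePath -> String -> IO ()
appendUtf8 path s =
  withFile path AppendMode $ \h -> do
    hSetEncoding h utf8  -- override the locale-derived default encoding
    hPutStr h s
```

Without the `hSetEncoding` call, the handle uses the encoding derived from the locale, which is what plain `appendFile` does.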
+1  A: 

After reading your reply to my comment, I think your situation is that you have some data structure, maybe with type [(String,String)], and you'd like to output it for debugging purposes. Using show would be convenient, but it escapes non-ASCII characters.

The problem here isn't with Unicode; you need a function that will properly format your data for display. I don't think show is the right choice, in part because of the problems with escaping some characters. What you need is a type class like Show, but one that displays data for reading instead of escaping characters. That is, you need a pretty-printer, which is a library that provides functions to format data for display. There are several pretty-printers available on Hackage; I'd look at uulib or wl-pprint to start. I think either would be suitable without too much work.

Here's an example with the uulib tools. The Pretty type class is used instead of Show; the library comes with many useful instances.

import UU.PPrint

-- | Write each item to StdOut
logger :: Pretty a => a -> IO ()
logger x = putDoc $ pretty x <+> line

Running this in ghci:

Prelude UU.PPrint> logger 'Д'
Д 
Prelude UU.PPrint> logger ('Д', "other text", 54)
(Д,other text,54) 
Prelude UU.PPrint> 

If you want to output to a file instead of the console, you can use the hPutDoc function to output to a handle. You could also call renderSimple to produce a SimpleDoc, then pattern match on the constructors to process output, but that's probably more trouble. Whatever you do, avoid show:

Prelude UU.PPrint> show $ pretty 'Д'
"\1044"

You could also write your own type class similar to show but formatted as you like it. The Text.Printf module can be helpful if you go this route.
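
A minimal sketch of such a class, assuming only a handful of instances are needed. The class name `Display` and its methods are hypothetical; the `displayList` method mirrors the `showList` trick that Show uses so that a String is rendered as text rather than as a list of characters.

```haskell
import Data.List (intercalate)

-- | Hypothetical Show-like class that renders values without escaping.
class Display a where
  display :: a -> String
  -- | How to render a list of values; overridden for Char so that
  -- a String is printed as plain text.
  displayList :: [a] -> String
  displayList xs = "[" ++ intercalate "," (map display xs) ++ "]"

instance Display Char where
  display c = [c]
  displayList s = s

instance Display Int where
  display = show  -- digits are never escaped, so show is safe here

instance Display a => Display [a] where
  display = displayList

instance (Display a, Display b) => Display (a, b) where
  display (a, b) = "(" ++ display a ++ "," ++ display b ++ ")"
```

For the [(String,String)] case from the comments, something like `appendFile "out" (display pairs ++ "\n")` would then write the characters themselves to the file.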

John
Thanks. I will try.
Anton
Could you give me a tip on how pretty-printers can help me?
Anton
I've added an example which should make this clear. Note that the usual way of using a pretty-printer would be to assemble all your data at once and render the document in one go. I've done this line-by-line because that's more useful for debugging; you'll get more partial output if your program crashes or hangs.
John