What's the best way to determine the native newline characters such as '\n' or '\r\n' in Haskell?

I see there is a "nativeNewline" function in GHC.IO.Handle, but I assume that it is both a private API and, above all, non-standard Haskell.

+7  A: 

You should think of the newline representation as part of the encoding of a text file that is stored in the filesystem, just like UTF-8. A text file is normally decoded when you read it into your program, and encoded when written -- converting to and from the native newline representation is done as part of this encoding and decoding. Inside your Haskell program, just as characters are represented by their Unicode code points, the newline character is always \n.

To tell the I/O system about the newline encoding you want to use, see the section on Newline Conversion in the documentation for System.IO.
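As a minimal sketch of what that section describes, the following sets the newline mode on a handle explicitly instead of relying on the platform default. The helper names `writeWithNewlines` and `readRaw` are made up for illustration; `hSetNewlineMode`, `NewlineMode`, and `noNewlineTranslation` come from System.IO.

```haskell
import System.IO

-- Hypothetical helper: write a String through a handle that uses the
-- given newline mode rather than the platform's native one.
writeWithNewlines :: NewlineMode -> FilePath -> String -> IO ()
writeWithNewlines mode path contents =
  withFile path WriteMode $ \h -> do
    hSetNewlineMode h mode
    hPutStr h contents

-- Hypothetical helper: read a file back with newline translation
-- switched off, so any '\r' stored in the file stays visible.
readRaw :: FilePath -> IO String
readRaw path =
  withFile path ReadMode $ \h -> do
    hSetNewlineMode h noNewlineTranslation
    s <- hGetContents h
    -- Force the lazy contents before withFile closes the handle.
    length s `seq` return s
```

For example, `writeWithNewlines (NewlineMode { inputNL = LF, outputNL = CRLF }) "out.txt" "a\nb\n"` stores `\r\n` line endings regardless of platform, while the string inside the program still contains only `\n`.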

Simon Marlow
So if I generate a file in memory (say, as Data.Text), I should use '\n' in any case, even on Windows?
Lenny222
Yes. The translation to `\r\n` will happen when you write the text to the file.
Simon Marlow
Ok, thanks Simon.
Lenny222
+2  A: 

System.IO.nativeNewline is not private - you can access it to find out what GHC considers the native "newline" to be on the current platform.

Note that the type of this variable, System.IO.Newline, does not have a Show instance as of GHC 6.12.3. So you can't easily print its value. Instead, check to see if it is equal to System.IO.LF or System.IO.CRLF.
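A small sketch of that check, written with pattern matching on the two constructors so it does not depend on a Show instance. The names `newlineString` and `nativeNewlineString` are invented for this example; `Newline`, `LF`, `CRLF`, and `nativeNewline` are exported by System.IO.

```haskell
import System.IO (Newline (..), nativeNewline)

-- Hypothetical helper: the character sequence a Newline value stands for.
newlineString :: Newline -> String
newlineString LF   = "\n"
newlineString CRLF = "\r\n"

-- The platform's newline sequence, as GHC sees it. This is an ordinary
-- pure value; no IO is involved in computing it.
nativeNewlineString :: String
nativeNewlineString = newlineString nativeNewline
```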

However, as Simon pointed out, you shouldn't need to know about the native newline sequence with normal usage of the text-oriented IO functions in GHC.

This variable, together with the rest of the new Unicode-aware capabilities of the IO system, is not yet part of the Haskell standard. It was not included in the Haskell 2010 report. However, since it is already implemented in GHC, and there is quite a widespread consensus that it is important and useful, expect it to be included in one of the upcoming yearly revisions of the standard.

Yitz
Thanks for the answer. My code is pure and thus not about IO. How would you solve that within Data.Text?
Lenny222
The `nativeNewline` constant is not in the IO monad, so you can use it in pure code. It just happens to be located in a module whose name is "`System.IO`", because it is normally used in the context of reading and writing text to a file or a user-visible device. In fact, I'm not sure why you need to know about the native newline sequence if you are not exchanging any textual information with the outside world.
Yitz
To make it concrete: I am generating LaTeX files in memory with Data.Text, and I am not sure how to separate the lines. Should I use a hard-coded '\n'? I don't know what is going to happen with the in-memory files: whether they are written to a file, displayed on the screen, or just have their lines counted. From Simon's answer I have the impression that I can use a hard-coded '\n' and Haskell's IO will automagically convert it to '\r\n' on Windows.
Lenny222
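For the in-memory case, a rough sketch along the lines of Simon's answer: build the document with plain '\n' separators and leave any conversion to whatever handle eventually writes it. The function names `renderDocument` and `lineCount` are hypothetical; `T.unlines`, `T.lines`, and `T.pack` are from Data.Text in the text package.

```haskell
import qualified Data.Text as T

-- Hypothetical helper: join lines into one document, terminating each
-- line with '\n' (the in-program newline on every platform).
renderDocument :: [T.Text] -> T.Text
renderDocument = T.unlines

-- Hypothetical helper: count lines without doing any IO.
lineCount :: T.Text -> Int
lineCount = length . T.lines

-- A tiny LaTeX document built purely in memory.
sampleDocument :: T.Text
sampleDocument = renderDocument (map T.pack
  [ "\\documentclass{article}"
  , "\\begin{document}"
  , "Hello"
  , "\\end{document}"
  ])
```

If the text is later written through a text-mode handle, the handle's newline mode decides what ends up in the file; the pure code above never needs to know.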