I have a UTF-16 little-endian file (it starts with the BOM bytes FF FE) which contains rows of tab-separated fields.

Having read http://stackoverflow.com/questions/2308112/splitting-unicode-i-think-using-split-in-ruby, I am going to use Ruby's split (first the file into lines, then each line into fields).

BTW, what are the Unicode code points for:

  • LF
  • CR
  • Tab

Thanks!

+3  A: 
LF:  U+000A  
CR:  U+000D  
Tab: U+0009  

http://en.wikipedia.org/wiki/List_of_Unicode_characters
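
As a rough sketch of how these code points fit the plan in the question (file to lines, then line to fields), something like the following might work in Ruby. The helper name `parse_utf16le_tsv` and the sample data are made up for illustration, and it assumes the input really is UTF-16LE, as the question states.

```ruby
# Hypothetical helper (not from the thread): decode UTF-16LE bytes,
# drop a leading BOM if present, then split into rows of fields.
def parse_utf16le_tsv(bytes)
  # Reinterpret the raw bytes as UTF-16LE and transcode to UTF-8.
  text = bytes.dup.force_encoding(Encoding::UTF_16LE).encode(Encoding::UTF_8)
  # The BOM decodes to the single code point U+FEFF.
  text = text.delete_prefix("\uFEFF")
  # Split on CR LF, LF (U+000A), or CR (U+000D); then on Tab (U+0009).
  # The -1 limit keeps trailing empty fields.
  text.split(/\r\n|\n|\r/).map { |line| line.split("\t", -1) }
end

# Made-up sample data, encoded the way the question describes the file.
sample = "\uFEFFname\tcity\r\nAda\tLondon\r\n".encode(Encoding::UTF_16LE).b
rows = parse_utf16le_tsv(sample)
p rows
```

For a real file, the bytes could come from `File.binread(path)`. Transcoding to UTF-8 first means `"\t"`, `"\n"` and `"\r"` in the source code match the decoded characters directly.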

Michael Petrotta
+2  A: 

Unicode TAB is U+0009, LF is U+000A, and CR is U+000D.

Same as ASCII actually.

Ayman
Simply because the first 256 code points of Unicode are the same as in Latin-1, which in turn uses ASCII for the first 128.
Joey
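
A quick way to see this for oneself, as a sketch: take a few byte values, treat each as a Latin-1 (ISO-8859-1) character, and compare it with the Unicode code point Ruby reports. The helper name `latin1_code_point` and the choice of sample bytes are assumptions for illustration only.

```ruby
# Hypothetical helper: interpret one byte as ISO-8859-1 and return the
# Unicode code point of the resulting character.
def latin1_code_point(byte)
  [byte].pack("C")                          # one raw byte
        .force_encoding(Encoding::ISO_8859_1)
        .encode(Encoding::UTF_8)
        .ord
end

# Tab, LF, CR (ASCII range) plus two bytes from the upper Latin-1 range.
samples = [0x09, 0x0A, 0x0D, 0xE9, 0xFF]
samples.each do |byte|
  printf("byte 0x%02X -> U+%04X\n", byte, latin1_code_point(byte))
end

all_match = samples.all? { |byte| latin1_code_point(byte) == byte }
```

Each byte value comes back as the code point with the same number, which is why Tab, LF and CR keep their ASCII values in Unicode.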