views:

5123

answers:

5

I thought the code below would work, but the regular expression never matches the \r\n. I have viewed the data I am reading in a hex editor and verified that the file really does contain the byte pattern 0x0D 0x0A.

I have also tried the regular expressions /\xD\xA/m and /\x0D\x0A/m but they also didn't match.

This is my code right now:

   lines2 = lines.gsub( /\r\n/m, "\n" )
   if ( lines == lines2 )
       print "still the same\n"
   else
       print "made the change\n"
   end

In addition to alternatives, it would be nice to know what I'm doing wrong (to facilitate some learning on my part). :)

+5  A: 
lines2 = lines.split(/\r?\n/).join("\n")
Cameron Price
+1  A: 

How about the following?

irb(main):003:0> my_string = "Some text with a carriage return \r"
=> "Some text with a carriage return \r"
irb(main):004:0> my_string.gsub(/\r/,"")
=> "Some text with a carriage return "
irb(main):005:0>

Or...

irb(main):007:0> my_string = "Some text with a carriage return \r\n"
=> "Some text with a carriage return \r\n"
irb(main):008:0> my_string.gsub(/\r\n/,"\n")
=> "Some text with a carriage return \n"
irb(main):009:0>
mwilliams
Also, I checked: "\r\n" != "\n". So it looks like the original poster's code is right.
rampion
+4  A: 

Generally when I deal with stripping \r or \n, I'll look for both by doing something like

lines.gsub(/\r\n?/, "\n");

I've found that, depending on how the data was saved (the OS used, the editor used, Jupiter's relation to Io at the time), there may or may not be a newline after the carriage return. It does seem weird that you see both characters in hex mode and still get no match. Hope this helps.
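
A minimal sketch of that pattern, to show that it covers all three common line endings. The helper name normalize_newlines and the sample string are made up for illustration:

```ruby
# Hypothetical helper name; it only wraps the gsub call shown above.
def normalize_newlines(text)
  # \r\n? matches a carriage return optionally followed by a newline,
  # so both "\r\n" and a lone "\r" are replaced with "\n".
  text.gsub(/\r\n?/, "\n")
end

sample = "dos\r\nold mac\runix\n"
puts normalize_newlines(sample).inspect
# prints "dos\nold mac\nunix\n"
```

Lines that already end in a bare "\n" are left untouched, because the pattern requires a "\r" to match.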

localshred
+7  A: 

What do you get when you do puts lines? That will give you a clue.

On Windows, File.open opens the file in text mode by default, so \r\n sequences are automatically converted to \n as the file is read. That may be why lines is always equal to lines2. To prevent Ruby from translating the line endings, use the rb mode:

C:\> copy con lala.txt
a
file
with
many
lines
^Z

C:\> irb
irb(main):001:0> text = File.open('lala.txt').read
=> "a\nfile\nwith\nmany\nlines\n"
irb(main):002:0> bin = File.open('lala.txt', 'rb').read
=> "a\r\nfile\r\nwith\r\nmany\r\nlines\r\n"
irb(main):003:0>

But from your question and code, it looks like you simply need to open the file in the default (text) mode. Then you don't need any conversion, and you can use the shorter File.read.
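
If you do want the raw bytes and your own substitution, something like the following should work. It is only a sketch: the helper name read_and_convert and the temporary file are invented for the demonstration, and the file is written in binary mode so that the \r\n bytes are stored exactly as given on any platform.

```ruby
require "tempfile"

# Hypothetical helper: read the file without newline translation ("rb"),
# then return both the raw text and a copy with \r\n replaced by \n.
def read_and_convert(path)
  raw = File.open(path, "rb") { |f| f.read }
  [raw, raw.gsub(/\r\n/, "\n")]
end

Tempfile.create("crlf_demo") do |tmp|
  tmp.binmode                    # write the bytes untranslated
  tmp.write("a\r\nfile\r\n")
  tmp.flush

  raw, converted = read_and_convert(tmp.path)
  puts raw.inspect               # prints "a\r\nfile\r\n"
  puts converted.inspect         # prints "a\nfile\n"
end
```

Because the file is read in rb mode, the \r characters are still present when gsub runs, so the original /\r\n/ pattern now has something to match.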

Romulo A. Ceccon
A: 

Thanks everyone for your answers. Romulo's note about opening the file in binary mode was what I was looking for and helped make my expression work.

Jeremy Mullin