views:

401

answers:

4

I don't understand this Ruby code:

>> puts '\\ <- single backslash'
# \ <- single backslash

>> puts '\\ <- 2x a, because 2 backslashes get replaced'.sub(/\\/, 'aa')
# aa <- 2x a, because 2 backslashes get replaced

so far, all as expected. but if we search for one backslash with /\\/ and replace it with two, encoded as '\\\\', why do we get this:

>> puts '\\ <- only 1 ... replace 1 with 2'.sub(/\\/, '\\\\')
# \ <- only 1 backslash, even though we replace 1 with 2

and then, when we encode 3 with '\\\\\\', we only get 2:

>> puts '\\ <- only 2 ... 1 with 3'.sub(/\\/, '\\\\\\')
# \\ <- 2 backslashes, even though we replace 1 with 3

anyone able to explain why a backslash gets swallowed in the replacement string? this happens on Ruby 1.8 and 1.9.

+1  A: 

argh, right after I typed all this out, I realised that \ is used to refer to groups (such as \1) in the replacement string. I guess this means that the replacement string itself has to contain \\ to produce one \ in the result. To write \\ in a string literal you need four \s, so to replace one with two you actually need eight(!).

# Double the first occurrence of \ (use gsub to double every occurrence). There are eight backslashes on the right there!
>> puts '\\'.sub(/\\/, '\\\\\\\\')

anything I'm missing? any more efficient ways?
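
To see where the backslashes go, it may help to separate the two layers: what the string literal contains, and what sub then makes of it. A rough sketch, with variable names made up for illustration:

```ruby
# Number of backslash characters each single-quoted literal really contains.
one   = '\\'        # 1 backslash
two   = '\\\\'      # 2 backslashes
three = '\\\\\\'    # 3 backslashes
four  = '\\\\\\\\'  # 4 backslashes

puts two.length     # 2
puts four.length    # 4

# In a replacement string, sub collapses each pair of backslashes into one.
r2 = one.sub(/\\/, two)    # one pair
r3 = one.sub(/\\/, three)  # one pair plus a lone trailing backslash
r4 = one.sub(/\\/, four)   # two pairs

puts r2   # \
puts r3   # \\
puts r4   # \\
```

So the eight backslashes in the source become four in the string, and those four become two in the result.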

Peter
I think you are correct. But Welch's way seems better to me.
pierr
+3  A: 

This is an issue because backslash (\) serves as an escape character for both Regexps and Strings. You could use the special sequence \& to reduce the number of backslashes in the gsub replacement string.

foo.gsub(/\\/, '\&\&\&') # for some string foo, replace each \ with \\\

EDIT: I should mention that the value of \& is from a Regexp match, in this case a single backslash.
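
As a small illustration of the \& approach, assuming a made-up sample string foo (inside single quotes, \b and \c are just a backslash followed by a letter):

```ruby
foo = 'a\b\c'                        # contains two backslashes

# '\&' in the replacement stands for the whole match, here one backslash,
# so three of them triple every backslash.
tripled = foo.gsub(/\\/, '\&\&\&')

puts tripled          # a\\\b\\\c
puts tripled.length   # 9
```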

Also, I thought that there was a special way to create a string that disabled the escape character, but apparently not. None of these will produce two backslashes:

puts "\\"
puts '\\'
puts %q{\\}
puts %Q{\\}
puts """\\"""
puts '''\\'''
puts <<EOF
\\
EOF
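
One form not in the list above is a heredoc whose identifier is wrapped in single quotes. As far as I know, current Ruby versions treat its body as raw text with no escape processing, so this sketch should keep both backslashes (I have not checked this on 1.8):

```ruby
# Quoting the heredoc identifier disables interpolation and escapes.
raw = <<'EOF'
\\
EOF

puts raw          # \\
puts raw.length   # 3: two backslashes plus the trailing newline
```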
sanscore
hmmm, interesting approach. a bit less 'pure', since if you have a more complex search it won't work. but definitely fewer characters...
Peter
+1  A: 

the pickaxe book mentions this exact problem, actually. here's another alternative (from page 130 of the latest edition)

str = 'a\b\c'               # => "a\b\c"
str.gsub(/\\/) { '\\\\' }   # => "a\\b\\c"
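
A quick way to check that the block form and the eight-backslash string form agree. This is only a sketch, and the variable names are mine rather than the book's:

```ruby
str = 'a\b\c'

# The block's return value is inserted as-is, with no backreference processing,
# so two backslashes in the string give two backslashes in the output.
with_block = str.gsub(/\\/) { '\\\\' }

# The string form is processed for backreferences, so it needs twice as many.
with_string = str.gsub(/\\/, '\\\\\\\\')

puts with_block                  # a\\b\\c
puts with_block == with_string   # true
```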
Peter
+2  A: 

Clearing up a little confusion on the author's second line of code.

You said:

>> puts '\\ <- 2x a, because 2 backslashes get replaced'.sub(/\\/, 'aa')
# aa <- 2x a, because 2 backslashes get replaced

2 backslashes aren't getting replaced here. The literal '\\' is a single escaped backslash, and you're replacing that one backslash with two a's ('aa'). That is, if you used .sub(/\\/, 'a'), you would only see one 'a'.

'\\'.sub(/\\/, 'anything') #=> anything
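
The same point as a small check, using a made-up sample string s:

```ruby
puts '\\'.length          # 1: the literal holds a single backslash

s = '\\ <- one backslash'
with_one = s.sub(/\\/, 'a')    # the one backslash becomes one a
with_two = s.sub(/\\/, 'aa')   # the one backslash becomes two a's

puts with_one   # a <- one backslash
puts with_two   # aa <- one backslash
```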
macek
sorry, absolutely right. that was more of a typo than a misunderstanding.
Peter