I have a bunch of strings with special escape codes that I want to store unescaped. For example, the interpreter shows

"\\014\"\\000\"\\016smoothing\"\\011mean\"\\022color\"\\011zero@\\016" but I want it to show (when inspected) as "\014\"\000\"\016smoothing\"\011mean\"\022color\"\011zero@\016"

What's the method to unescape them? I imagine that I could make a regex to remove 1 backslash from every consecutive n backslashes, but I don't have a lot of regex experience and it seems there ought to be a "more elegant" way to do it.

For example, when I puts MyString it displays the output I'd like, but I don't know how I might capture that into a variable.

Thanks!

Edited to add context: I have this class that is used to marshal and restore some data, but when I restore some old strings it raises a TypeError. I've determined that this is because they weren't, for some inexplicable reason, stored as base64; they appear to have just been escaped instead. I don't want that, because trying to restore them as-is gives `TypeError: incompatible marshal file format (can't be read) format version 4.8 required; 92.48 given`, since Marshal looks at the first characters of the string to determine the format.

require 'base64'
class MarshaledStuff < ActiveRecord::Base

  validates_presence_of :marshaled_obj

  def contents
    obj = self.marshaled_obj
    return Marshal.restore(Base64.decode64(obj))
  end

  def contents=(newcontents)
    self.marshaled_obj = Base64.encode64(Marshal.dump(newcontents))
  end
end
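
As a rough illustration of where the `92.48` in that error comes from: Marshal reads the first two bytes of its input as the major and minor format version, and an escaped record starts with a literal backslash (byte 92) and the digit `0` (byte 48). The `escaped` string below is a made-up sample, not one of the actual stored records.

```ruby
# A valid dump starts with the two version bytes 4 and 8.
header = Marshal.dump("smoothing").bytes.first(2)

# Hypothetical escaped record: single quotes keep the backslashes literal,
# so this string starts with the characters "\" and "0".
escaped = '\004\010"\016smoothing'
escaped_header = escaped.bytes.first(2)

# Loading the escaped text fails at the version check.
error_message =
  begin
    Marshal.load(escaped)
    nil
  rescue TypeError => e
    e.message
  end

p header          # [4, 8]
p escaped_header  # [92, 48]
puts error_message
```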

Edit 2: Changed wording -- I was thinking they were "double-escaped" but it was only single-escaped. Whoops!

+1  A: 

If your strings give you the correct output when you print them, then they are already escaped correctly. The extra backslashes you see are probably there because you are displaying them in the interactive interpreter, which adds backslashes when it displays values to make them less ambiguous.

> x
=> "\\"
> puts x
\
=> nil
> x.length
=> 1

Note that even though it looks like x contains two backslashes, the length of the string is one. The extra backslash is added by the interpreter and is not really part of the string.
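
A small sketch of the same point outside the interpreter, assuming the interpreter's display is the result of `inspect`: the string itself is what `puts` prints, and `inspect` returns a separate, longer string with the quotes and escapes added. The variable name `shown` is only for illustration.

```ruby
x = "\\"            # Ruby source syntax for a string holding one backslash

shown = x.inspect   # the quoted, escaped form that the interpreter displays

p x.length          # 1
p x.bytes           # [92]
p shown.length      # 4: quote, backslash, backslash, quote
puts x              # prints the single backslash
```

So there is nothing extra to capture from `puts`: the variable already holds the text that `puts` displays.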

If you still think there's a problem, please be more specific about how you are displaying the strings that you mentioned in your question.


Edit: In your example, the only things that need unescaping are the octal escape codes. You could try this:

x = x.gsub(/\\[0-3][0-7]{2}/){ |c| c[1,3].to_i(8).chr }
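
Here is a minimal sketch of that substitution wrapped in a helper and checked against a Marshal round trip. The helper name `unescape_octal` and the way the escaped sample is built are assumptions for illustration; the sketch also assumes Ruby 2.0 or later for `String#b`, which keeps the result binary so that bytes above `\177` do not cause encoding errors.

```ruby
# Hypothetical helper: turns literal "\NNN" octal sequences back into raw bytes.
def unescape_octal(escaped)
  escaped.b.gsub(/\\([0-3][0-7]{2})/) { Regexp.last_match(1).to_i(8).chr }
end

original = Marshal.dump("smoothing")

# Build an escaped copy for the demo: non-printable bytes become "\NNN" text.
escaped = original.bytes.map do |byte|
  (byte < 32 || byte > 126) ? format('\\%03o', byte) : byte.chr
end.join

restored = unescape_octal(escaped)

p restored == original     # true
p Marshal.load(restored)   # "smoothing"
```

One caveat with this approach: if the original data contained a real backslash followed by three octal digits, the substitution would convert that too, so it is only safe when every `\NNN` sequence in the stored text is known to be an escape.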
Mark Byers
Heh, looking at a non-broken string, you're right that they're only single-escaped--but I don't want them escaped at all!
RubyNoobie
@RubyNoobie: The interpreter will never show your second example because it isn't a valid string. Even a string of length one containing just a double quote will show as `"\""` in the interpreter. I think your problem lies elsewhere.
Mark Byers
`None`? Or `nil`?
Justice
@Mark Byers -- ah, that's true. So I don't want to unescape the `"`s, just the `\\004`, etc.
RubyNoobie
@RubyNoobie: I think I understand your question now. :) I've provided a solution that will work for '\\012' => '\012', but maybe that's not enough. I'm not sure if there is a method that does this... probably there is, I just don't happen to know it.
Mark Byers
@Mark Byers -- Thanks! That appears to satisfy 'Marshal.restore'.
RubyNoobie