I have special strings like name1="value1" name2='value2'. Values can contain whitespace and are delimited by either single or double quotes. Names never contain whitespace. Name/value pairs are separated by whitespace.

I want to parse them into a hash of name-value pairs, like this:

string.magic_split() => { "name1"=>"value1", "name2"=>"value2" }

If Ruby understood lookaround assertions, I could do this with

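hash = {}
# Split on whitespace that directly follows a closing quote (lookbehind),
# so each element keeps both of its quotes for the match below.
string.split(/(?<=[\'\"])\s+/).each do |element|
    element =~ /(\w+)=[\'\"](.*)[\'\"]/
    hash[$1] = $2
end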

but Ruby does not understand lookaround assertions, so I am somewhat stuck.

However, I am sure that there are much more elegant ways to solve this problem anyway, so I turn to you. Do you have a good idea for solving this problem?

+1  A: 

This is not a complete answer, but Oniguruma, the standard regexp library in Ruby 1.9, supports lookaround assertions. It can be installed as a gem if you are using Ruby 1.8.x.

That said, and as Sorpigal has commented, instead of using a regexp I would be inclined to iterate through the string one character at a time, keeping track of whether you are in a name portion, when you reach the equals sign, when you are within quotes, and when you reach a matched closing quote. On reaching a closing quote, you can put the name and value into the hash and proceed to the next entry.
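
The character-by-character approach described above could be sketched roughly as follows. This is only an illustration, not tested against every edge case: the method name `parse_pairs` and the state names are made up for this example, and malformed input (for instance a missing closing quote) is silently dropped rather than reported.

```ruby
# Hypothetical helper: walk the string one character at a time and
# track whether we are reading a name, waiting for the opening quote,
# or reading a value.
def parse_pairs(input)
  pairs = {}
  state = :name            # one of :name, :open_quote, :value
  name  = String.new
  value = String.new
  quote = nil              # the quote character that opened the current value

  input.each_char do |ch|
    case state
    when :name
      if ch == "="
        state = :open_quote
      elsif ch !~ /\s/
        name << ch         # whitespace between pairs is skipped
      end
    when :open_quote
      if ch == '"' || ch == "'"
        quote = ch
        state = :value
      end
    when :value
      if ch == quote       # matched closing quote: store the pair
        pairs[name] = value
        name  = String.new
        value = String.new
        state = :name
      else
        value << ch        # the other quote character is kept as data
      end
    end
  end
  pairs
end
```

Because only the quote character that opened a value can close it, a call such as `parse_pairs(%q(name1="value 1" name2='"hi" she said'))` keeps the embedded double quotes in the second value.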

mikej
Ruby on OS X is stuck at 1.8.7 at the moment. (I know I could update it manually, but I don't want to run into compatibility issues with the Xcode tools etc.)
BastiBechtold
+1  A: 

Try splitting on /[='"] ?/

I don't know Ruby syntax, but here is a Perl script you could translate:

#!/usr/bin/perl 
use 5.10.1;
use warnings;
use strict;
use Data::Dumper;

my $str =  qq/name1="val ue1" name2='va lue2'/;

my @list = split/[='"] ?/,$str;
my %hash;
for (my $i=0; $i<@list;$i+=3) {
  $hash{$list[$i]} = $list[$i+2];
}
say Dumper \%hash;

Output:

$VAR1 = {
          'name2' => 'va lue2',
          'name1' => 'val ue1'
        };
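
One possible Ruby translation of the Perl script above, offered as a sketch: the method name `split_pairs` is invented here, and the approach shares the Perl version's limitation that values must not themselves contain `=` or quote characters. The split yields three fields per pair (name, an empty string between `=` and the opening quote, then the value), which is why the fields are consumed in groups of three.

```ruby
# Hypothetical helper mirroring the Perl split-and-step-by-3 loop.
def split_pairs(str)
  # Split on '=', a single quote, or a double quote, each optionally
  # followed by one space.
  fields = str.split(/[='"] ?/)
  hash = {}
  # Each pair contributes: name, "" (between = and the quote), value.
  fields.each_slice(3) { |name, _empty, value| hash[name] = value }
  hash
end

p split_pairs(%q(name1="val ue1" name2='va lue2'))
```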
M42
+1  A: 
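class String

  def magic_split
    # Normalize quotes to ', mark pair boundaries with "\, ", split there,
    # then strip the quotes and split each pair on "=".
    pairs = self.gsub('"', '\'').gsub('\' ', '\'\, ').split('\, ').map{ |pair| pair.gsub("'", "").split("=") }
    Hash[pairs]
  end

end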
jordinl
+4  A: 

This fails on values like '"hi" she said', but it might be good enough.

str = %q(name1="value1" name2='value 2')
p Hash[ *str.chop.split( /' |" |='|="/ ) ]
#=> {"name1"=>"value1", "name2"=>"value 2"}
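
If the caveat above matters, one alternative is to match each pair with `String#scan` and a backreference, so that a value can only be closed by the same quote character that opened it. This is a sketch rather than a drop-in replacement: the method name `scan_pairs` is hypothetical, and it assumes a value never contains its own delimiter.

```ruby
# Hypothetical helper: \2 refers back to whichever quote opened the value.
def scan_pairs(str)
  hash = {}
  str.scan(/(\S+?)=(["'])(.*?)\2/) do |name, _quote, value|
    hash[name] = value
  end
  hash
end

p scan_pairs(%q(name1="value1" name2='"hi" she said'))
```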
steenslag
Nice. You can avoid the chop if you instead do split( /' |" |='|="|'$|"$/ )
jordinl
Wow, that is a cool solution. Thanks! Could you explain what the `*` before `str` does?
BastiBechtold
The result of `str.chop.split` is an array. The `*` converts the elements of this array into multiple arguments to be passed to the `[]` method of `Hash`.
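
A small illustration of the splat in this context (the variable names are made up for the example): `Hash[]` accepts a flat, even-length argument list of alternating keys and values, and `*` expands the array into exactly that.

```ruby
flat = ["name1", "value1", "name2", "value 2"]

# The splat turns the array into four separate arguments,
# as if Hash["name1", "value1", "name2", "value 2"] had been written.
from_splat = Hash[*flat]

# The splat applies to the result of the whole method chain that follows it.
from_chain = Hash[*"key value".split]

# Without a splat, Hash[] also accepts an array of [key, value] pairs.
from_pairs = Hash[flat.each_slice(2).to_a]
```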
mikej
Interesting, I never realized the splat operator had such low precedence that it would apply to the entire expression.
Mark Thomas
+1  A: 

This should do it for you, as long as the values contain no whitespace (the initial split breaks on every space).

 class SpecialString
   def self.parse(string)
     string.split.map{|s| s.split("=") }.inject({}) {|h, a| h[a[0]] = a[1].gsub(/"|'/, ""); h }
   end
 end
Adam Tanner