This seems like it should be fairly simple, but for some reason I can't think of the right way to do this:

I have a string h that looks something like one(two(three four) five six) seven.

I'd like to turn this into a nested hash, so that the output is something like

{'one' => 
       {'two' => 
              {'three' => nil, 'four' => nil},
        'five'=>nil, 'six'=>nil
       }, 'seven'=>nil}

We can assume that the parentheses are balanced.

Is there any easy way to do this? In a language that encourages the use of for loops, this would be relatively simple; I don't think I've been using Ruby long enough to get a feel for the Ruby way of solving this sort of problem.

Thanks!

+1  A: 

Without any context it's difficult to give you anything that might work in a more general case.

This code will work for your specific example, just using regular expressions and eval, but I would hate to use code like this in practice.

For more complex parsing of strings you might look into using http://treetop.rubyforge.org/ or similar. But then you're getting into the territory of writing your own language.

h = "one(two(three four) five six) seven"

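# Turn the parentheses into hash braces and wrap the whole string in one more pair
s = h.tr("()", "{}")
s = "{#{s}}"
# Quote every word and follow it with a hash rocket
s = s.gsub(/(\w+)/, '"\1" =>')
# A key followed directly by another key gets nil as its value
s = s.gsub(/\>\s\"+/, '> nil, "')
# So does a key followed directly by a closing brace
s = s.gsub(/\>\}+/, '> nil },')
# Drop the trailing comma left by the previous substitution
s = s[0..-2]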

puts h
r = eval(s)
puts r.inspect
puts r.class.name

Was there some concrete example that you were trying to get an answer to?

Also, I might add that you can make your life much easier if you are able to provide strings which map more naturally to being parsed by Ruby. Obviously this depends on whether you have control of the source.
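
To illustrate that last point: if the source could emit JSON instead (purely an assumption here, the question doesn't say whether that is possible), the standard library would do all the parsing. A minimal sketch, where the JSON string is made up to mirror the example:

```ruby
require 'json'

# Assumption: whoever produces the string can emit JSON instead of the
# custom "one(two(three four) five six) seven" format.
json = '{"one": {"two": {"three": null, "four": null}, "five": null, "six": null}, "seven": null}'

# JSON null becomes nil, JSON objects become Hashes with String keys.
result = JSON.parse(json)

p result
p result.class
```

No eval and no hand-written regular expressions are needed in that case.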

Joc
+1  A: 

Here is a recursive solution:

def f(str)
  parts = ['']
  nesting_level = 0
  str.split('').each do |c|
    if c != ' ' or nesting_level > 0
      parts.last << c
    end
    if [' ', ')'].include?(c) and nesting_level == 0
      parts << ''
    end
    case c
    when '('
      nesting_level += 1
    when ')'
      nesting_level -= 1
    end
  end
  hash = {}
  parts.each do |seg|
    unless seg.include?('(')
      hash[seg] = nil
    else
      key = seg[/^[^\(\) ]+/]
      value = seg[(key.length + 1)..(seg.length - 2)].to_s
      hash[key] = f value
    end
  end
  hash
end

f 'one(two(three four) five six) seven' #=> {"one"=>{"two"=>{"three"=>nil, "four"=>nil}, "five"=>nil, "six"=>nil}, "seven"=>nil}
Adrian
A: 

Using nested (recursive) regex groups, which need Ruby 1.9 or later. Not as performant as a dedicated parser/scanner, since each recursive call re-scans the subgroup it was given.

def hash_from_group(str)
    ret = {}
    str.scan(/
        (?<key_name>\w+)
        (?<paren_subgroup>
            \(
                (?:
                    [^()]
                    |
                    \g<paren_subgroup>
                )*  # * or + here, depending on whether empty parens are allowed, e.g. foo(bar())
            \)
        )? # paren_subgroup optional
    /x) do
        md = $~
        key,value = md[:key_name], md[:paren_subgroup]
        ret[key] = value ? hash_from_group(value) : nil
    end
    ret
end


p hash_from_group('one(two(three four) five six) seven') # => {"one"=>{"two"=>{"three"=>nil, "four"=>nil}, "five"=>nil, "six"=>nil}, "seven"=>nil}
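
For comparison, here is a rough sketch of what the single-pass scanner alternative might look like, using StringScanner from the standard library and an explicit stack instead of recursion. The name parse_groups is made up for this example, and it assumes keys consist of word characters and that an opening parenthesis follows its key directly, as in the question:

```ruby
require 'strscan'

# Hypothetical helper (not from the answers above): builds the nested hash
# in one left-to-right pass over the string.
def parse_groups(str)
  scanner = StringScanner.new(str)
  root = {}
  stack = [root]   # stack.last is the hash currently being filled
  last_key = nil   # the key just read, in case a "(" follows it directly

  until scanner.eos?
    if scanner.skip(/\s+/)
      last_key = nil                 # whitespace separates keys
    elsif (key = scanner.scan(/\w+/))
      stack.last[key] = nil          # stays nil unless a group follows
      last_key = key
    elsif scanner.skip(/\(/)
      raise ArgumentError, "'(' without a preceding key" unless last_key
      child = {}
      stack.last[last_key] = child   # replace the nil with a nested hash
      stack.push(child)
      last_key = nil
    elsif scanner.skip(/\)/)
      raise ArgumentError, "unbalanced ')'" if stack.size == 1
      stack.pop
      last_key = nil
    else
      raise ArgumentError, "unexpected character at position #{scanner.pos}"
    end
  end

  raise ArgumentError, "unbalanced '('" unless stack.size == 1
  root
end

p parse_groups('one(two(three four) five six) seven')
```

Each character is looked at once, so nothing is re-scanned, and unbalanced input raises an error instead of being silently accepted.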
jason.rickman