Hi all, I am trying to extract substrings using a regexp. For example, given the following string:

select DESCENDANTS([Customer].[Yearly Income],,LEAVES) on axis(0),
       DESCENDANTS([Sales Territory].[Sales Territory],,LEAVES) on axis(1),
       DESCENDANTS([Customer].[Total Children],,LEAVES) on axis(2)
  from [Adventure Works]
 where [Measures].[Internet Sales Amount]

I want to extract the substring between every pair of "DESCENDANTS(" and ",,".

So the result in this case would be: [Customer].[Yearly Income], [Sales Territory].[Sales Territory], [Customer].[Total Children]

Any help is appreciated. Thanks in advance.

+4  A: 

If you have your text in a string called query you can do:

query.scan(/DESCENDANTS\((.+),,/).flatten
=> ["[Customer].[Yearly Income]", "[Sales Territory].[Sales Territory]",
"[Customer].[Total Children]"]

Some notes:

  • \( matches the literal opening parenthesis after DESCENDANTS
  • (.+) captures the characters between that parenthesis and the two commas
  • If the regexp contains captures () then scan will return an array of arrays of the captured parts for each match. In this case there is only 1 capture per match so flatten can be used to return a single array of all the matches we are interested in.
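
To make the array-of-arrays point concrete, here is a small sketch using the query from the question. The variable names and the two-capture pattern are only illustrative assumptions, not part of the original answer.

```ruby
# The query from the question, stored in a heredoc (assumed variable name: query).
query = <<~MDX
  select DESCENDANTS([Customer].[Yearly Income],,LEAVES) on axis(0),
         DESCENDANTS([Sales Territory].[Sales Territory],,LEAVES) on axis(1),
         DESCENDANTS([Customer].[Total Children],,LEAVES) on axis(2)
    from [Adventure Works]
   where [Measures].[Internet Sales Amount]
MDX

# With one capture group, scan returns one single-element array per match.
nested = query.scan(/DESCENDANTS\((.+),,/)
# => [["[Customer].[Yearly Income]"], ["[Sales Territory].[Sales Territory]"], ["[Customer].[Total Children]"]]

# flatten collapses that into a plain array of strings.
flat = nested.flatten

# With two capture groups (illustrative pattern), each inner array has two elements,
# so flatten would mix dimension and hierarchy names together.
pairs = query.scan(/DESCENDANTS\(\[(.+?)\]\.\[(.+?)\],,/)
# => [["Customer", "Yearly Income"], ["Sales Territory", "Sales Territory"], ["Customer", "Total Children"]]
```

So flatten is convenient with a single capture, but with several captures you would probably want to keep the nested arrays.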
mikej
+1  A: 
/DESCENDANTS\(([^,]+),,/

See it on rubular

Aillyn
That will exclude descendants that happen to contain commas.
glenn jackman
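
A quick sketch of the comma problem mentioned in the comment. The member name "[Smith, John]" is a made-up example, not something from the original query.

```ruby
# Hypothetical input: a member name that contains a comma.
with_comma = "select DESCENDANTS([Customer].[Smith, John],,LEAVES) on axis(0)"

# [^,]+ stops at the first comma, which is not followed by a second one,
# so the pattern finds no match at all.
negated = with_comma.scan(/DESCENDANTS\(([^,]+),,/).flatten
# => []

# A non-greedy .+? is allowed to cross single commas and stops at the first ",,".
lazy = with_comma.scan(/DESCENDANTS\((.+?),,/).flatten
# => ["[Customer].[Smith, John]"]
```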
A: 

Here's an uglier variation that uses split: split on "DESCENDANTS(" and ",,", and take every other substring:

s.split(/DESCENDANTS\(|,,/).each_with_index.inject([]) {|m,(e,i)| m << e if i.odd?; m}
glenn jackman
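
For anyone puzzling over the split approach, this sketch shows the intermediate array on a shortened, made-up one-line query, plus an alternative way of picking the odd-indexed pieces. It assumes ",," never appears anywhere else in the text.

```ruby
# Hypothetical shortened query on a single line.
s = "select DESCENDANTS([A].[B],,LEAVES) on axis(0), DESCENDANTS([C].[D],,LEAVES) on axis(1) from [Cube]"

# Splitting on either delimiter leaves the wanted pieces at the odd indexes.
parts = s.split(/DESCENDANTS\(|,,/)
# => ["select ", "[A].[B]", "LEAVES) on axis(0), ", "[C].[D]", "LEAVES) on axis(1) from [Cube]"]

# Same selection as the inject version, written with select.with_index.
names = parts.select.with_index { |_, i| i.odd? }
# => ["[A].[B]", "[C].[D]"]
```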
A: 

.+? is safer: because it is non-greedy, it also works correctly if the whole query is on one line.

query.scan(/DESCENDANTS\((.+?),,/).flatten
zzzhc
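
A small sketch of the difference, using a shortened version of the question's query squeezed onto one line (the one_line variable is just an assumed name):

```ruby
# Two DESCENDANTS calls on a single line, so "." can run from the first to the second.
one_line = "select DESCENDANTS([Customer].[Yearly Income],,LEAVES) on axis(0), " \
           "DESCENDANTS([Customer].[Total Children],,LEAVES) on axis(1) from [Adventure Works]"

# Greedy .+ runs to the LAST ",," on the line and swallows everything in between.
greedy = one_line.scan(/DESCENDANTS\((.+),,/).flatten
# => ["[Customer].[Yearly Income],,LEAVES) on axis(0), DESCENDANTS([Customer].[Total Children]"]

# Non-greedy .+? stops at the FIRST ",," after each DESCENDANTS(.
lazy = one_line.scan(/DESCENDANTS\((.+?),,/).flatten
# => ["[Customer].[Yearly Income]", "[Customer].[Total Children]"]
```

The greedy version only appears to work on the original input because each DESCENDANTS call sits on its own line and "." does not match a newline by default in Ruby.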