views: 42

answers: 3

I am having quite a bit of difficulty using a regex in Ruby to split a string along several delimiters. These delimiters are:

  • ,
  • /
  • &
  • and

Each of these delimiters can have any amount of whitespace on either side, but each item can itself contain a valid space. A good example that I've been testing against is the string "1, 2 /3 and 4 12".

What I would like is something along the lines of "1, 2 /3 and 4 12".split(regex) => ["1", "2", "3", "4 12"]

The closest I've been able to get is /\s*,|\/|&|and \s*/ but this generates ["1", " 2 ", "3 ", "4 12"] instead of the desired results.

I realize this is very close and I could simply call strip on each item, but being so close and knowing it can be done is sort of driving me mad. Hopefully someone can help me keep the madness at bay.

+3  A: 
/\s*,|\/|&|and \s*/

This parses as /(\s*,)|\/|&|(and \s*)/. I.e. the leading \s* only applies to the comma and the trailing \s* only applies to "and". You want:

/\s*(,|\/|&|and )\s*/

Or, to avoid capturing:

/\s*(?:,|\/|&|and )\s*/
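
To see the difference between the three patterns, here is a minimal sketch that runs each one against the example string from the question. The constant names (INPUT, ORIGINAL, CAPTURING, NON_CAPTURING) are made up for illustration; the regexes themselves are the ones quoted above.

```ruby
# Constant names are illustrative only; the patterns come from the thread.
INPUT = "1, 2 /3 and 4 12"

ORIGINAL      = /\s*,|\/|&|and \s*/       # \s* binds only to "," and to "and "
CAPTURING     = /\s*(,|\/|&|and )\s*/     # group fixes precedence but captures
NON_CAPTURING = /\s*(?:,|\/|&|and )\s*/   # group fixes precedence, no capture

p INPUT.split(ORIGINAL)       # => ["1", " 2 ", "3 ", "4 12"]
p INPUT.split(CAPTURING)      # => ["1", ",", "2", "/", "3", "and ", "4 12"]
p INPUT.split(NON_CAPTURING)  # => ["1", "2", "3", "4 12"]
```

The middle result shows why the capturing version looks wrong at first: String#split includes the text matched by capture groups in the returned array, so the delimiters come back alongside the items.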
sepp2k
I KNEW I was close! I was not familiar with `?:`, and I guess that was the trick. It does not seem to return the correct results without `?:`. Thanks, now I need to go look up what `?:` does.
Apeiron
@Apeiron: `?:` just makes the group non-capturing. In this case that means that the part matched by the parens will not show up in the result of `split`.
sepp2k
+1  A: 

Try .scan:

irb(main):030:0> "1, 2 /3 and 4 12".scan(/\d+(?:\s*\d+)*/)
=> ["1", "2", "3", "4 12"]
Nakilon
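
One thing worth noting about the scan approach: it matches the items rather than the delimiters, so it only works when the items have a known shape. The pattern above assumes every item is one or more runs of digits. A small sketch of what that means in practice (NUMERIC_ITEM is a made-up name, and the second and third inputs are invented test strings, not from the question):

```ruby
# NUMERIC_ITEM is an illustrative name for the pattern used with scan above.
NUMERIC_ITEM = /\d+(?:\s*\d+)*/

# Digit-only items separated by any of the delimiters are picked up.
p "1, 2 /3 and 4 12".scan(NUMERIC_ITEM)  # => ["1", "2", "3", "4 12"]
p "10&20 and 30 40".scan(NUMERIC_ITEM)   # => ["10", "20", "30 40"]

# Items that are not digits are silently skipped.
p "apples, pears".scan(NUMERIC_ITEM)     # => []
```

If the real items can be arbitrary text, splitting on the delimiters is probably the safer route.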
+1  A: 

You can try:

(?:\s*)[,\/&](?:\s*)|(?:\s*)and(?:\s*)

But as Nakilon suggested, you may have better luck with scan instead of split.
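
For what it's worth, the character-class idea can be folded into a single group. The sketch below is one possible variant, not a pattern taken from the answers: DELIMITER and split_items are hypothetical names, and the `\b` word boundaries around "and" are an added assumption (that "and" only counts as a delimiter when it stands alone as a word, so that something like "sand" is left intact).

```ruby
# Hypothetical names; the \b anchors are an assumption, not part of the question.
DELIMITER = /\s*(?:[,\/&]|\band\b)\s*/

# Split a string into items, dropping whitespace around each delimiter.
def split_items(text)
  text.strip.split(DELIMITER)
end

p split_items("1, 2 /3 and 4 12")        # => ["1", "2", "3", "4 12"]
p split_items("sand & gravel and 4 12")  # => ["sand", "gravel", "4 12"]
```

Whether the word-boundary behaviour is wanted depends on the data; without `\b`, the "and" inside "sand" would be treated as a delimiter too.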

tinifni