views:

114

answers:

2

I'm a little stumped on this one, and I'm still on 1.8, so I don't have lookahead.

I have a bunch of strings which can look like:

"a/b/c/d/e/f 1/2/3"

which I want to turn into:

"a/b/c/d/e" "f" "1/2" "3"

So basically I want it to split on the last slash before each run of whitespace (or before the end of the string), and on the whitespace itself. I feel like I can do this normally, but split always seems to do weird things.

+1  A: 
# Split one whitespace-free token at its last slash.
def foo s
   return [$1,$2] if s =~ /(.+)\/(\S+)/
   [s]  # no slash: keep the token as-is
end

str = "a/b/c/d/e/f 1/2/3"
a = str.split(/\s+/)
a.collect { |e| foo e }.flatten

=> ["a/b/c/d/e", "f", "1/2", "3"]

I broke down the split and collect. You could, of course, shorten this as needed.
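For reference, a shortened form might look like the sketch below. The method name `split_last_slash` is made up for illustration, and the anchored pattern is a slight variation on the one above, not the exact code from this answer. It only uses `split`, `map`, and `flatten`, so it should not depend on anything newer than 1.8.

```ruby
# Hypothetical helper: split on whitespace, then split each token at its last slash.
def split_last_slash(str)
  str.split.map { |token|
    # Greedy (.+) runs up to the final slash; tokens without a slash pass through.
    token =~ /\A(.+)\/([^\/]+)\z/ ? [$1, $2] : [token]
  }.flatten
end

p split_last_slash("a/b/c/d/e/f 1/2/3")  # => ["a/b/c/d/e", "f", "1/2", "3"]
```

Tokens with no slash come back unchanged instead of turning into `nil` entries.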

ezpz
Thanks, that works well. Out of curiosity, is it not possible to do it with a regex alone?
Stacia
Yes. See mine...
glenn mcdonald
As glenn pointed out, this is entirely possible. I have a personal fear of the overuse of regex, however. Having had to maintain, update, and revisit such constructs, I always opt to use them only for what is needed, and even then I try to keep them simple.
ezpz
A fair point, but to ponder it just a little further, in this case I think your version is not materially more straightforward from the perspective of a later code-maintainer encountering it. At least with the single-regexp version it's clear that the goal is to split the string, and if it fails, you know you have to fix this one regexp.
glenn mcdonald
[continued] Your way it's not immediately obvious whether the split and collect bits are doing two parts of one conceptual action, or two different things, and if the string changes format and things break you won't immediately know whether the problem is in the split, the split's regexp, foo's regexp, or foo's returning of [$1,$2].
glenn mcdonald
ezpz
[continued] In 1.8, the implementation is a `case` statement with logical progression between states that can occur. In 1.9 there is a jumbo regular expression that handles most of the logic. The 1.9 version is broken. I just want to use `shellwords`, but in order to do that in 1.9 I have to spend several hours to break down that massive state machine and figure out how to add the one state that is missed. Making things worse, I didn't write the original regex so I have no working knowledge of its construction. I'm still using 1.8.
ezpz
+2  A: 

1.8 lacks look*behind*, not look*ahead*! All you need is this:

str.split(/\/(?=[^\/]+(?: |$))| /)

This split pattern matches a) any slash that is followed by non-slash characters up to the next space or the end of the string, and b) any space.
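
To see the two alternatives at work, here is a small sketch that wraps the pattern in a method. The names `LAST_SLASH_OR_SPACE` and `split_path_groups` are invented for this example; only the regexp itself comes from the answer.

```ruby
# Hypothetical names; the pattern is the one given above.
LAST_SLASH_OR_SPACE = /\/(?=[^\/]+(?: |$))| /

def split_path_groups(str)
  # Splits at a) the last slash of each space-delimited group and b) each space.
  str.split(LAST_SLASH_OR_SPACE)
end

p split_path_groups("a/b/c/d/e/f 1/2/3")  # => ["a/b/c/d/e", "f", "1/2", "3"]
p split_path_groups("x/y")                # => ["x", "y"]
```

Note that the pattern treats only a single literal space as a separator, so input containing tabs or runs of spaces would probably need `\s+` in place of the space.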

glenn mcdonald