Hi

I am trying to figure out the best way to do this...

Given a string

s = "if someBool || x==1 && y!=22314" 

I'd like to use Ruby to separate the statements and boolean operators, so I'd like to split this into

["if","someBool","||","x","==","1","&&","y","!=","22314"]

I could use s.split(), but this only splits with spaces as delimiters, and I'd like x!=y to be split too (such expressions are valid boolean statements; they just don't have spaces in between for readability). Of course the easiest way is to require the user to put spaces between boolean operators and variables, but is there any other way to do this?

+1  A: 

You can get split to split on anything you want, including a regex. Something like:

s.split( /\s|==|!=/ )

...might be a start.

Disclaimer: regexen make my head hurt. I've tested it now, and it works against your example.


UPDATE: No it doesn't. split always skips what it splits on, so the above code loses the == and != from your example. (Monoceres' code works fine.)

But for some reason if you enclose the split term in the regex in brackets, it keeps the thing in the answer array instead of just splitting on it. I don't know if this is a bug, a feature, or some clever bit of design I don't appreciate properly.

So in fact you need:

s.split( /\s|(==)|(!=)/ )

But this is hardly code that explains itself. And for all I know it doesn't work in 1.9.
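For what it's worth, keeping the bracketed part is documented behaviour of String#split: when the pattern contains capture groups, the text they match is included in the result. A minimal sketch of that, where the helper name split_keeping_operators is made up for illustration and is not part of the original code:

```ruby
# Hypothetical helper (name invented for this sketch).
# Whitespace is matched outside any group, so it is discarded;
# == and != are matched inside capture groups, so they are kept.
def split_keeping_operators(s)
  s.split(/\s|(==)|(!=)/)
end

p split_keeping_operators("x==1")
p split_keeping_operators("if someBool || x==1 && y!=22314")
```

Note that || and && only survive here because the example string already has spaces around them; the pattern itself does not know about those operators.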

Shadowfirebird
+1  A: 

Something like this works:

s = "12&&32 || 90==12 !=67"
a = s.split(/ |(\|\|)|(&&)|(!=)|(==)/)
a.delete("")
p a

For some reason "" remained in the array; the delete line fixed that.
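A likely explanation for the empty strings: when two delimiters sit next to each other (a space directly followed by ||, say), split returns the empty piece between them. The sketch below shows that on a small input and filters the blanks with reject instead of delete; the helper name tokens_without_blanks is an assumption made for this example.

```ruby
# Two adjacent delimiters (" " then "||", "||" then " ") leave empty pieces.
p "a || b".split(/ |(\|\|)/)

# Hypothetical helper: same pattern as above, with the empty pieces dropped.
def tokens_without_blanks(s)
  s.split(/ |(\|\|)|(&&)|(!=)|(==)/).reject(&:empty?)
end

p tokens_without_blanks("12&&32 || 90==12 !=67")
```

Unlike delete, reject returns a new array, so the whole thing can stay one expression.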

monoceres
+1  A: 

My rule of thumb: use split if you know what to throw away (the delimiters); use scan with a regex if you know what to keep. In this case you know what to keep (the tokens), so:

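s.scan(/ \w+ | (?: \s|\b )(?: \|\| | && | [=!]= )(?: \s|\b ) /x).map(&:strip)
# => ["if", "someBool", "||", "x", "==", "1", "&&", "y", "!=", "22314"]

The (?: \s|\b ) "delimiters" are there to prevent your tokens (e.g. ==) from matching inside something you don't want (e.g. !==). Because a \s delimiter is consumed as part of the match, the strip removes the whitespace that would otherwise surround || and &&.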
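The "know what to keep" rule can also be sketched in a stripped-down form: list each kind of token and let scan collect them in order. This is only an illustration with a made-up helper name (tokenize). It has no guard around the operators, so an input like x!==y would come back as x, !=, y with the stray = silently dropped.

```ruby
# Hypothetical helper: words, ||, &&, == and != are the only tokens kept;
# anything else (whitespace, stray characters) is skipped by scan.
def tokenize(s)
  s.scan(/\w+|\|\||&&|[=!]=/)
end

p tokenize("if someBool || x==1 && y!=22314")
```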

glenn jackman
I think scan is more C-like and probably more efficient, and I'd like to stick with it too.
@jocapco, if "c-like" is important, why Ruby?
glenn jackman
It's not important, it just looks good :) .. Ruby because what I am doing requires the user to know very little and do a lot, so a scripting language like Ruby is preferable. I'm still only doing small-scale stuff.
+2  A: 

Split on whitespace or a word boundary:

s = "if someBool || x==1 && y!=22314"
a = s.split( /\s+|\b/ )
p a

Output:

["if", "someBool", "||", "x", "==", "1", "&&", "y", "!=", "22314"]
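One caveat, as a sketch rather than a definitive statement: \b only fires between a word character and a non-word character, so operators that touch each other with no space or word in between stay glued together. The helper name split_on_boundaries is invented for this example.

```ruby
# Hypothetical helper wrapping the split above.
def split_on_boundaries(s)
  s.split(/\s+|\b/)
end

p split_on_boundaries("if someBool || x==1 && y!=22314")
# No word boundary falls between "&&" and "!", so they come back as one piece.
p split_on_boundaries("a&&!b")
```

For input where every operator has a word or whitespace on both sides, as in the question, this does not come up.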
FM
Yep, that does it for me. Nice.
Shadowfirebird