I am trying to parse a multi-line string and get the rest of the line following a pattern.

text:

hello john
your username is: jj
thanks for signing up

I want to extract jj, aka everything after "your username is: "

One way:

text = "hello john\nyour username is: jj\nthanks for signing up\n"
match = text[/your username is: (.*)/]
value = $1

But this reminds me of Perl... and doesn't "read" as naturally as I am told Ruby should.

Is there a cleaner way? AKA a "Ruby" way?

Thanks

+4  A: 

The split method is mindbogglingly useful. It divides a string into an array of substrings, separating on whatever you pass in. If you don't give it any arguments, it splits on whitespace. So if you know the word you're looking for is the sixth "word" (index 5, splitting on both spaces and the newline character), you can do this:

text = "hello john\nyour username is: jj\nthanks for signing up\n"
match = text.split[5]

...but perhaps that's not sufficiently self-documenting, or you want to allow for multi-word matches. You could do this instead:

midline = text.split("\n")[1]
match = midline.split("username is: ").last

Or perhaps this more terse way:

match = text[/username is: (.*)/, 1]
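
For comparison, here is a minimal sketch that runs the three approaches side by side and then tries them on a slightly different greeting. The `shifted` string and the variable names are made up for illustration; the point is only that the positional split depends on the exact word count, while the label-based versions do not.

```ruby
text = "hello john\nyour username is: jj\nthanks for signing up\n"

# Whitespace split: "jj" sits at index 5 (the sixth token).
by_index = text.split[5]

# Take the second line, then split on the label itself.
midline = text.split("\n")[1]
by_label = midline.split("username is: ").last

# Regex with an explicit capture-group index.
by_regex = text[/username is: (.*)/, 1]

# Hypothetical input with one extra word in the greeting.
shifted = "hello there john\nyour username is: jj\nthanks for signing up\n"
shifted_by_index = shifted.split[5]                    # no longer the username
shifted_by_regex = shifted[/username is: (.*)/, 1]     # still the username

puts by_index, by_label, by_regex
puts shifted_by_index, shifted_by_regex
```

If the incoming text is not fixed, the regex form is probably the safer of the three.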

glenra
I would do that, but split on " " instead of "your username is: "
Matt Briggs
+1 for an interesting use of split to get the fifth word, I hadn't noticed that.
The Wicked Flea
Wow, thanks for the multiple attack paths. ;) I hadn't noticed the split possibility, but the incoming text isn't static (the example was greatly simplified) so it won't work... but it was a nice way to attack the problem.
SWR
Matt: The only reason to split on something like "username is: " or "your username is: " is to make it a little more obvious what the code is doing. It says to the later maintenance programmer "oh, this is getting the 'username is: ' text" rather than "this is getting the last word from some line (of unknown provenance)". Though I suppose one could accomplish the same goal by naming the variables something better than "match" and "midline"...
glenra
+10  A: 

Your code is pretty much the Ruby way. If you don't want to use the global $1, you can use the two-argument version of String#[]:

match = text[/your username is: (.*)/, 1]
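
A small sketch of how the two-argument form behaves, in case it helps. The named group `username` and the "no label here" string are illustrative choices, not part of the question; String#[] also accepts a capture name as the second argument, and it returns nil when the pattern does not match.

```ruby
text = "hello john\nyour username is: jj\nthanks for signing up\n"

# Numbered capture: second argument is the group index.
by_number = text[/your username is: (.*)/, 1]

# Named capture: second argument is the group name (illustrative name).
by_name = text[/your username is: (?<username>.*)/, "username"]

# No match: the result is nil rather than an exception.
missing = "no label here"[/your username is: (.*)/, 1]

p by_number, by_name, missing
```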
outis
Thanks. That is exactly what I was looking for! Pulling up dollar globals just seemed too old school. Went and read the API docs for Ruby, Class:String and there it was. http://www.ruby-doc.org/core/classes/String.html#M000786
SWR
+2  A: 

Not sure if it's any more Ruby'ish, but another option:

>> text = "hello john\nyour username is: jj\nthanks for signing up\n"
>> text.match(/your username is: (.*)/)[1]
=> "jj"
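
One thing to watch with this form: String#match returns nil when nothing matches, so calling [1] directly on the result would raise NoMethodError. A possible guard, sketched with a hypothetical helper name `extract_username`:

```ruby
# Hypothetical helper: returns the captured username, or nil if the
# label is not present in the text.
def extract_username(text)
  m = text.match(/your username is: (.*)/)
  m && m[1]
end

text = "hello john\nyour username is: jj\nthanks for signing up\n"

p extract_username(text)
p extract_username("no label here")
```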
dbr
This is how I'd do it. It's not technically superior to the [] solution, but I think the intent is clearer when you're skimming it.
Chuck
A: 

There's also Regexp#match, which returns a MatchData object, which has all the information you could possibly want.

irb> match = /your username is: (.*)/.match "hello john\nyour username is: jj\nthanks for signing up\n"
#=> #<MatchData:0x557f94>
irb> match.pre_match
#=> "hello john\n"
irb> match.post_match
#=> "\nthanks for signing up\n"
irb> match[0]
#=> "your username is: jj"
irb> match[1]
#=> "jj"
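
To round this out, a short sketch of a few more things MatchData offers. The group name `username` is an illustrative choice. It also shows why (.*) stops at the end of the line: by default `.` does not match a newline in Ruby, and adding the /m flag changes that.

```ruby
text = "hello john\nyour username is: jj\nthanks for signing up\n"

# Named capture, read back by symbol from the MatchData object.
match = /your username is: (?<username>.*)/.match(text)
username = match[:username]

# pre_match, the full match, and post_match reassemble the original string.
rebuilt = match.pre_match + match[0] + match.post_match

# With /m, "." also matches newlines, so the capture runs to the end.
greedy = text[/your username is: (.*)/m, 1]

p username, rebuilt == text, greedy
```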
rampion