I have come across a problem that I cannot seem to solve. I have extracted a line from a web page into a variable. Let's say, for argument's sake, this is:

rhyme = "three blind mice Version 6.0"

and I want to be able to, first of all, locate the version number within this string (6.0) and, secondly, extract this number into another separate variable. (I specifically want to extract no more than "6.0".)

I hope I have clarified this enough, if not please ask me anything you need to know and I will get back to you asap.

A: 
if rhyme =~ /(\d\.\d)/
    version = $1
end

The regexp matches a digit, followed by a period, followed by another digit. The parentheses capture their contents. Since it is the first pair of parentheses, its match is stored in $1.
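
For comparison, here is a minimal sketch of the same capture without relying on the global $1. It assumes the same `rhyme` string as in the question and uses `String#match`, which returns a `MatchData` object (or `nil` if nothing matches):

```ruby
rhyme = "three blind mice Version 6.0"

# String#match returns a MatchData object, or nil when the pattern is not found.
match = rhyme.match(/(\d\.\d)/)

# match[1] is the first capture group, the same text that $1 would hold.
version = match ? match[1] : nil

puts version.inspect
```

Note that this pattern, like the one above, only accepts a single digit on each side of the period.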

Magnar
looks like Perl.
Mark Thomas
+2  A: 

First you need to decide what the pattern for a version number should be. One possibility would be \d+(\.\d+)*$ (a number followed by zero or more (dot followed by a number) at the end of the string).

Then you can use String#[] to get the substring that matches the pattern:

rhyme[ /\d+(\.\d+)*$/ ] #=> "6.0"
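
As a rough illustration of what this pattern accepts, the sketch below runs it over a few sample strings. The constant name `VERSION_AT_END` and the sample strings (other than the one from the question) are made up for this example:

```ruby
# Hypothetical name for the pattern above: digits, then zero or more ".digits", at the end.
VERSION_AT_END = /\d+(\.\d+)*$/

samples = [
  "three blind mice Version 6.0",
  "three blind mice Version 6",
  "three blind mice Version 6.0.0",
  "Sepp2k's base 2 to base 16 converter Version 42",
  "no version here"
]

# String#[] with a regexp returns the whole matched substring, or nil if there is no match.
results = samples.map { |s| s[VERSION_AT_END] }

samples.zip(results).each { |s, v| puts "#{s.inspect} => #{v.inspect}" }
```

Because of the anchor, numbers that appear earlier in the string are ignored. One thing to keep in mind is that `$` in Ruby matches at the end of a line, so for multi-line input `\z` may be the safer anchor.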
sepp2k
Your regex misses the 6. It should be /(\d+\.\d+)/.
Gerhard
@Gerhard: Nope. My regex works as is. If you run my code in irb, you'll see that it does indeed return "6.0" like I said. Also unlike your proposed regex, mine would also match "6" or "6.0.0".
sepp2k
@Gerhard: To make it work with `scan`, you need to make the group non-capturing (and remove the anchor of course). Like this: `"1.0 2.0.0 3".scan(/\d+(?:\.\d+)*/) #=> ["1.0", "2.0.0", "3"]`
sepp2k
@Gerhard: That being said, I anchored the regex specifically because I assumed that the OP would not want stray numbers appearing in the name to affect the result, so I wouldn't recommend using scan. I.e. I think for "Sepp2k's base 2 to base 16 converter Version 42" it should just return 42 and ignore the 2 and the 16.
sepp2k
@sepp2k: I checked it in irb and rubular and it works. But what is the purpose of the brackets? Would `/\d+\.\d+/` not be sufficient?
Gerhard
@Gerhard: As I said, my regex also matches "1.0.0" or "1", while `/\d+\.\d+/` would only match "1.0".
sepp2k
OK, so () is not just a capture group but also a grouping. Your Regex Fu exceeds mine!
Gerhard
Got it to work with this one: rhyme[ /\d+(\.\d+)*$/ ] #=> "6.0". Nice, simple and easy! Thanks!
+1  A: 

You need to use regular expressions. I would use rhyme.scan(/(\d+\.\d+)/) since it can return an array if multiple matches occur. It can also take a block so that you can add range checks or other checks to ensure the right one is captured.

version = "0.0"
rhyme = "three blind mice Version 6.0"
rhyme.scan(/(\d+\.\d+)/){|x| version = x[0] if x[0].to_f < 99}
p version
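
To make the `x[0]` above less mysterious, here is a small sketch of what `scan` returns with and without a capture group. The input string is a made-up example with several candidates:

```ruby
# Hypothetical input containing more than one version-like number.
text = "mice 1.0 then 2.5 then 300.1"

# With a capture group, scan yields one array of groups per match.
with_group = text.scan(/(\d+\.\d+)/)

# Without a group, scan yields the matched strings directly.
without_group = text.scan(/\d+\.\d+/)

# The same kind of range check as in the block above, written with select.
plausible = without_group.select { |v| v.to_f < 99 }

p with_group
p without_group
p plausible
```

Also note that in the block form above, `version` is reassigned for every match that passes the check, so the last qualifying match wins.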

If the input can be trusted to yield only one match, or if you are always going to use the first match, you can just use the solution in this answer.

Edit: So after our discussion just go with that answer.

Gerhard