ansaurus

Question

Ruby regular expression and extraction from a string

Answer 1

A:

Maybe with /\/to\/[^\/]*\/(.*)\/t/ :

"http://linkto.com/to/1pyTZl/somesite.com/2009/10/monit-on-ubuntu/t" =~ /\/to\/[^\/]*\/(.*)\/t/
puts $1

-> somesite.com/2009/10/monit-on-ubuntu

nacmartin 2009-12-08 20:58:23

Answer 2

A:

/to/\w+/(.*?)/t

Rubens Farias 2009-12-08 20:58:42
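One thing worth checking with a lazy group and no end anchor: `(.*?)` stops at the first `/t` it meets, so a target path that contains a segment starting with "t" gets cut short. A minimal sketch of that behaviour follows; the second URL is made up for illustration, and `extract_lazy` / `extract_anchored` are hypothetical helper names, not part of the answer above.

```ruby
# Hypothetical helpers comparing the unanchored lazy pattern with an anchored one.
LAZY     = %r{/to/\w+/(.*?)/t}
ANCHORED = %r{/to/\w+/(.*)/t\z}

def extract_lazy(url)
  url[LAZY, 1]
end

def extract_anchored(url)
  url[ANCHORED, 1]
end

original = "http://linkto.com/to/1pyTZl/somesite.com/2009/10/monit-on-ubuntu/t"
# Made-up URL whose target path contains a segment beginning with "t".
tricky   = "http://linkto.com/to/1pyTZl/somesite.com/tips/ruby/t"

puts extract_lazy(original)    # somesite.com/2009/10/monit-on-ubuntu
puts extract_lazy(tricky)      # somesite.com
puts extract_anchored(tricky)  # somesite.com/tips/ruby
```

For the URL in the question both patterns give the same result; the difference only shows up on inputs like the made-up one.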

Answer 3

+2 A:

/\/to\/\w+\/(.*)\/t/i

A great resource is Rubular. It allows you to test your expression against inputs and see the matches.

Erik Nedwidek 2009-12-08 21:04:02

Rubular is a good tool; I like it.

Steve Zhang 2009-12-09 01:50:39

This suffers from leaning toothpick syndrome. Use `%r` to choose different delimiters.

glenn jackman 2009-12-09 11:56:29
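To illustrate the `%r` suggestion, here is a small sketch that writes the same pattern both ways and checks that they extract the same text. The constant names are assumptions chosen for this example only.

```ruby
# The same pattern written twice: with escaped slashes, and with %r{} delimiters.
SLASHES = /\/to\/\w+\/(.*)\/t/i
BRACES  = %r{/to/\w+/(.*)/t}i

URL = "http://linkto.com/to/1pyTZl/somesite.com/2009/10/monit-on-ubuntu/t"

puts URL[BRACES, 1]                     # somesite.com/2009/10/monit-on-ubuntu
puts URL[SLASHES, 1] == URL[BRACES, 1]  # true
```

Any delimiter that does not appear in the pattern works; braces are a common choice because they rarely need escaping in path-like patterns.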

Answer 4

+5 A:

That string looks like it's actually not a string but a URI. So, let's treat it as one:

require 'uri'
uri = URI.parse(str)

Now, extracting the path component of the URI is a piece of cake:

path = uri.path

This already greatly limits what can go wrong with our own parsing: the only part of the URI we still have to deal with is the path component.
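As a rough illustration of why isolating the path helps, the sketch below appends a query string and fragment to the URL (both made up for this example) and compares matching against the raw string with matching against the parsed path. `STR` and `PATTERN` are names assumed here for illustration.

```ruby
require 'uri'

# Made-up variant of the URL with a query string and a fragment appended.
STR     = 'http://linkto.com/to/1pyTZl/somesite.com/2009/10/monit-on-ubuntu/t?ref=mail#top'
PATTERN = %r|/to/\w+/(.*/)t$|i

uri = URI.parse(STR)

puts uri.host   # linkto.com
puts uri.path   # /to/1pyTZl/somesite.com/2009/10/monit-on-ubuntu/t
puts uri.query  # ref=mail

p STR[PATTERN, 1]       # nil, because the raw string does not end in "/t"
p uri.path[PATTERN, 1]  # "somesite.com/2009/10/monit-on-ubuntu/"
```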

A Regexp that matches the part you are interested in looks like this:

%r|/to/\w+/(.*/)t$|i

If we put all of that together, we end up with something like this:

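require 'uri'

# Note: this replaces the standard library's own (obsolete) URI.extract method.
def URI.extract(uri)
  # Parse the URI, take its path, and return capture group 1 of the pattern.
  return parse(uri).path[%r|/to/\w+/(.*/)t$|i, 1]
end

require 'test/unit'
class TestUriExtract < Test::Unit::TestCase
  def test_that_the_path_gets_extracted_correctly
    uri  = 'http://linkto.com/to/1pyTZl/somesite.com/2009/10/monit-on-ubuntu/t'
    # The capture group includes the trailing slash before the final "t".
    path = 'somesite.com/2009/10/monit-on-ubuntu/'
    assert_equal path, URI.extract(uri)
  end
end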

Jörg W Mittag 2009-12-08 23:22:18

upvoted for using behavior-driven answering :)

Adrian 2009-12-09 12:32:56

Answer 5

+2 A:

The answers so far are right, but you should make sure the trailing /t is really at the end of the string by using the $ anchor:

regex = %r(/to/[^/]+/(.*)/t$)
'http://linkto.com/to/1pyTZl/somesite.com/2009/10/monit-on-ubuntu/t' =~ regex
puts $1

Adrian 2009-12-08 23:35:46
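A detail that may matter for multi-line input: in Ruby, `$` matches at the end of any line, while `\z` matches only at the very end of the string. The sketch below shows the difference on a made-up two-line input; `LINE_END` and `STRING_END` are names assumed for this example.

```ruby
LINE_END   = %r{/to/[^/]+/(.*)/t$}
STRING_END = %r{/to/[^/]+/(.*)/t\z}

# Made-up input: the URL followed by a newline and more text.
multiline = "http://linkto.com/to/1pyTZl/somesite.com/2009/10/monit-on-ubuntu/t\nextra"

p multiline[LINE_END, 1]    # "somesite.com/2009/10/monit-on-ubuntu"
p multiline[STRING_END, 1]  # nil
```

For a single-line URL like the one in the question, both anchors behave the same.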

Thanks. I updated my answer accordingly.

Jörg W Mittag 2009-12-09 09:50:25

Answer 6

A:

s = "http://linkto.com/to/1pyTZl/somesite.com/2009/10/monit-on-ubuntu/t"
puts s[/to\/.+?\/(.*)\/t$/, 1]
=> somesite.com/2009/10/monit-on-ubuntu

jhickner 2009-12-09 01:06:38
