ansaurus

Question

Clearer way to parse a token out of a string in ruby

Answer 1

+2 A:

How about String#include? or String#index?

url.include? "myuserid"

Or do you want some positional thing? If so, then you could split the URL.
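A minimal sketch of those three options side by side. The URL is an assumed example, chosen only so that "myuserid" has something to match against:

```ruby
# Illustrative URL; an assumption, not taken from the question.
url = "http://claimid.com/myuserid"

# include? only reports whether the substring is present.
found = url.include?("myuserid")   # => true

# index reports where the first occurrence starts, or nil if absent.
position = url.index("myuserid")   # => 19

# split breaks the URL on "/" so that components can be picked by position.
parts = url.split("/")             # => ["http:", "", "claimid.com", "myuserid"]
last_part = parts.last             # => "myuserid"
```

None of these knows which part of the URL is the username; they only help once you already know what to look for or where it sits.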

Yet a third thought: Using your input form with the :username thing, construct and compile a Regexp for each such string, and use Regexp#match to return a MatchData. If you kept pairs of the Regexp and the index of the :username field, you could do it directly.
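A rough sketch of that idea, keeping each Regexp together with the index of its username group. The two patterns, the constant USERNAME_PATTERNS and the method username_for are all hypothetical names made up for illustration:

```ruby
# Hypothetical pairs of [pattern, index of the capture group holding the username].
USERNAME_PATTERNS = [
  [%r{\Ahttp://claimid\.com/(\w+)/?\z}, 1],
  [%r{\Ahttp://(\w+)\.blogspot\.com/?\z}, 1]
]

# Returns the username from the first pattern that matches, or nil if none does.
def username_for(url)
  USERNAME_PATTERNS.each do |regexp, index|
    match = regexp.match(url)   # a MatchData, or nil when the pattern does not apply
    return match[index] if match
  end
  nil
end

username_for("http://claimid.com/myusername")    # => "myusername"
username_for("http://myusername.blogspot.com")   # => "myusername"
username_for("http://example.org/")              # => nil
```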

Charlie Martin 2009-05-21 04:21:20

In this case I can't use regular old include? I have input "http://claimid.com/myusername" and from that I need output "myusername". The problem is the input could be something else like "http://myusername.blogspot.com" and I'd still want output "myusername". Basically finding the username part of an openid URL; but the openid URL could be anything, and might not be found. It sounds like the "third thought" is what I'm doing in the bottom example already as well? It's running through all potential strings, and getting the "username" part of each, clearing nils and returning the first one.

AdamFortuna 2009-05-21 04:43:23

Answer 2

+1 A:

I still think a regular expression can be a solution here. However, you need to write code that creates a regexp out of a routing-like string. Example code:

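class Router
    def initialize(routing_word)
        # Placeholder names such as ":username", in order of appearance
        @routes = routing_word.scan(/:\w+/)
        # Escape regexp metacharacters first, then turn every placeholder into
        # a capture group. "/" needs no escaping when the pattern is built
        # from a string, and the caller's string is left unmodified.
        pattern = Regexp.escape(routing_word).gsub(/:\w+/) { '(\w+)' }
        # \A and \z anchor to the whole string (^ and $ only anchor to a line)
        @regex = Regexp.new('\A' + pattern + '\z')
    end

    # Returns a hash of placeholder => captured value, or nil if url does not match
    def match(url)
        matches = url.match(@regex)
        return nil unless matches
        @routes.zip(matches.captures).to_h
    end
end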

r = Router.new('|:as|:sa')
puts r.match('|a|b').map {|k,v| "#{k} => #{v}\n"}

Use one Router per routing string. Its match method returns a hash that maps each colon-prefixed placeholder in the routing string to the corresponding component of the URL.

To recognize a given URL, go through all the Routers and find out which one accepts it.

class OpenIDRoutes
 def initialize
  # Must be an instance variable so that route can see it
  @routes = [
     "http://openid.aol.com/:username/",
     "http://:username.myopenid.com/",
     "http://:username.livejournal.com/",
     "http://flickr.com/photos/:username/",
     "http://technorati.com/people/technorati/:username/",
     "http://:username.wordpress.com/",
     "http://:username.blogspot.com/",
     "http://:username.pip.verisignlabs.com/",
     "http://:username.myvidoop.com/",
     "http://:username.pip.verisignlabs.com/",
     "http://claimid.com/:username/"
  ].map {|x| Router.new x}
 end

 # Given a URL, return the result of the first route it fits, or nil if none fits
 def route(url)
  # Every route above ends in "/", so add one if the URL lacks it
  url += '/' unless url.end_with?('/')
  @routes.each do |r|
   res = r.match url
   return res if res
  end
  nil
 end
end

r = OpenIDRoutes.new
puts r.route("http://claimid.com/myusername")

I think that's a nice and easy implementation of most of Rails routing.

Elazar Leibovich 2009-05-21 04:53:40

Answer 3

+4 A:

I think your best bet is to generate the regular expressions, as Elazar suggested previously. If you're just matching one field (:username) then something like this would work:

PROVIDERS = [
   "http://openid.aol.com/:username/",
   "http://:username.myopenid.com/",
   "http://:username.livejournal.com/",
   "http://flickr.com/photos/:username/",
   "http://technorati.com/people/technorati/:username/",
   "http://:username.wordpress.com/",
   "http://:username.blogspot.com/",
   "http://:username.pip.verisignlabs.com/",
   "http://:username.myvidoop.com/",
   "http://:username.pip.verisignlabs.com/",
   "http://claimid.com/:username/"
]

MATCHERS = PROVIDERS.collect do |provider|
  parts = provider.split(":username")
  Regexp.new(Regexp.escape(parts[0]) + '(.*)' + Regexp.escape(parts[1] || ""))
end

def extract_username(url)
  MATCHERS.collect {|rx| url[rx, 1]}.compact.first
end

It's very similar to your own code, only the list of providers is much cleaner, making it easier to maintain and add new providers as required.
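One caveat, sketched under assumptions: every pattern in PROVIDERS ends in "/", so a URL without a trailing slash, such as "http://claimid.com/myusername", does not match as written. The helper extract_username_normalized below is hypothetical (it is not part of the code above) and simply appends the slash first. The provider list is shortened and renamed SAMPLE_PROVIDERS to keep the example self-contained:

```ruby
# Shortened provider list, for illustration only.
SAMPLE_PROVIDERS = [
  "http://:username.blogspot.com/",
  "http://claimid.com/:username/"
]

# Same construction as above: escape the literal parts, capture what lies between.
SAMPLE_MATCHERS = SAMPLE_PROVIDERS.collect do |provider|
  parts = provider.split(":username")
  Regexp.new(Regexp.escape(parts[0]) + '(.*)' + Regexp.escape(parts[1] || ""))
end

def extract_username(url)
  SAMPLE_MATCHERS.collect { |rx| url[rx, 1] }.compact.first
end

# Hypothetical helper: add the trailing slash that the patterns expect.
def extract_username_normalized(url)
  url = url + "/" unless url.end_with?("/")
  extract_username(url)
end

extract_username("http://claimid.com/myusername")              # => nil
extract_username_normalized("http://claimid.com/myusername")   # => "myusername"
extract_username_normalized("http://myusername.blogspot.com")  # => "myusername"
extract_username_normalized("http://example.org/")             # => nil
```

If your URLs are already normalized with a trailing slash, the original extract_username is enough on its own.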

tomafro 2009-05-21 10:31:06

Awesome, works great! Lot less code and easier to read.

AdamFortuna 2009-05-22 02:13:47

Answer 4

+1 A:

It's a little URI specific, but the standard library has URI.split():

require 'uri'

URI.split("http://claimid.com/myusername")[5] # => "/myusername"

Might be able to use that somehow.
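One way it might be used, as a rough heuristic rather than a real solution: take the last path segment if there is one, otherwise the first host label. The method username_from_uri is a hypothetical name, and unlike the provider lists in the other answers it does not check which site the URL belongs to:

```ruby
require 'uri'

# Hypothetical helper: guess the username from the path or, failing that,
# from the subdomain. It knows nothing about specific OpenID providers.
def username_from_uri(url)
  uri = URI.parse(url)

  # Prefer the last non-empty path segment, e.g. "/photos/name/" gives "name".
  segment = uri.path.to_s.split("/").reject { |s| s.empty? }.last
  return segment if segment

  # Otherwise use the first host label, but only when there is a subdomain.
  labels = uri.host.to_s.split(".")
  labels.length > 2 ? labels.first : nil
end

username_from_uri("http://claimid.com/myusername")    # => "myusername"
username_from_uri("http://myusername.blogspot.com")   # => "myusername"
username_from_uri("http://claimid.com/")              # => nil
```

The obvious weakness is that any subdomain counts, so "http://www.example.com" would yield "www".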

C.J.

CJ 2009-05-21 14:06:29
