I match URLs in part of my code, and currently I use regular expressions for this. That works, but it does not always produce nice, easily readable patterns. Is there a language defined for matching URLs? A pattern should look like this: http://*.example.com/* so it would offer simple wildcards and other things useful for URLs.

Ideally, these expressions could be easily transformed into regexps. Do you know of a specification for such a language, or even an implementation, preferably for Ruby? Otherwise I will implement it myself. The key is the readability of the patterns. Thanks for your help!

A: 

You'd have to carefully work your syntax out before you begin. At first glance, what you intend would be easily achieved by translating your syntax into ordinary regexes:

s = 'http://*.example.com/*' #=> "http://*.example.com/*"
r = Regexp.compile("^#{Regexp.escape(s).gsub('\*','.*')}$") #=> /^http:\/\/.*\.example\.com\/.*$/
'http://test.example.com/path/to/doc.html' =~ r #=> 0
'http://test.example2.com/path/to/doc.html' =~ r #=> nil
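One caveat with the snippet above: in Ruby, ^ and $ anchor to a line rather than to the whole string, so a URL followed by a newline and extra text could still match. A slightly stricter variant might look like the sketch below. The helper name url_pattern_to_regexp is made up for illustration, and it assumes that * is the only special character in the pattern syntax:

```ruby
# Hypothetical helper: turn a wildcard URL pattern into a Regexp.
# Assumption: "*" is the only special character and stands for any run of
# characters (including none); everything else is matched literally.
def url_pattern_to_regexp(pattern)
  # Escape each literal piece, then join the pieces with ".*".
  # The -1 limit keeps a trailing empty piece, so a pattern ending in "*" works.
  body = pattern.split('*', -1).map { |part| Regexp.escape(part) }.join('.*')
  # \A and \z anchor to the whole string; ^ and $ only anchor to a line.
  Regexp.new("\\A#{body}\\z")
end

matcher = url_pattern_to_regexp('http://*.example.com/*')

matcher.match?('http://test.example.com/path/to/doc.html')   # matches
matcher.match?('http://test.example2.com/path/to/doc.html')  # does not match
matcher.match?("http://a.example.com/x\nhttp://evil.test/")  # does not match
```

Note that * here still matches across / and ., so http://*.example.com/* would also accept a host like a.b.example.com. Whether that is desirable is exactly the kind of thing to decide when working out the syntax.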
Mladen Jablanović
A: 

URLs can be a bit tricky to parse correctly, especially if you want to be standards-compliant. This is why Ruby includes the uri library in its standard library.
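As a rough sketch of that approach, the URL can be parsed first and its components compared individually instead of matching the raw string. The method name example_subdomain? and the rule it checks (an http URL on a subdomain of example.com) are assumptions chosen to mirror the pattern in the question:

```ruby
require 'uri'

uri = URI.parse('http://test.example.com/path/to/doc.html?x=1')
uri.scheme  # "http"
uri.host    # "test.example.com"
uri.path    # "/path/to/doc.html"
uri.query   # "x=1"

# Hypothetical check, roughly equivalent to http://*.example.com/*
def example_subdomain?(url)
  uri = URI.parse(url)
  # to_s guards against a nil host (e.g. for relative references).
  uri.scheme == 'http' && uri.host.to_s.end_with?('.example.com')
rescue URI::InvalidURIError
  # Strings that are not valid URIs simply do not match.
  false
end

example_subdomain?('http://test.example.com/path/to/doc.html')   # true
example_subdomain?('http://test.example2.com/path/to/doc.html')  # false
```

Comparing parsed components avoids some pitfalls of string matching, such as a wildcard accidentally matching across the host and path boundary.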

For a more advanced parsing library with placeholders like the ones you want, look into the addressable gem.

Marc-André Lafortune