ansaurus

Question

regex to get the domain name from a url in ruby

Answer 1

+1 A:

I don't know much about Ruby, but this regex pattern gives you the last three parts of the URL, excluding the trailing slash, with a minimum of two characters per part.

([\w-]{2,}\.[\w-]{2,}\.[\w-]{2,})/$

Fabian 2010-07-24 08:46:41

Should be `([\w-]{2,}\.[\w-]{2,}\.[\w-]{2,})\/$`. +1 though.

Sarfraz 2010-07-24 08:55:29
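
To make the behaviour of this pattern concrete, here is a minimal Ruby sketch. The constant `LAST_THREE_PARTS` and the helper `last_three_parts` are hypothetical names introduced only for illustration. The pattern is written with `%r{}` so the slash needs no escaping.

```ruby
# Pattern from the answer above; %r{} avoids having to escape the slash.
LAST_THREE_PARTS = %r{([\w-]{2,}\.[\w-]{2,}\.[\w-]{2,})/$}

# Hypothetical helper: returns the first capture group, or nil if nothing matches.
def last_three_parts(url)
  url[LAST_THREE_PARTS, 1]
end

last_three_parts('http://www.abc.google.com/') # "abc.google.com"
last_three_parts('http://www.google.com/')     # "www.google.com"
last_three_parts('http://google.com/')         # nil
```

Note that the pattern always needs exactly three dot-separated parts before a trailing slash, so it keeps the www. on a three-part host and matches nothing on a two-part host.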

What about (?<=//)[^/]+

SchlaWiener 2010-07-24 09:09:29
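
The lookbehind suggestion can be tried the same way. This is only a sketch, assuming Ruby 1.9 or later (where lookbehind is supported); `HOST_AFTER_SLASHES` is a hypothetical name.

```ruby
# Everything after the first "//" up to, but not including, the next "/".
HOST_AFTER_SLASHES = %r{(?<=//)[^/]+}

'http://www.abc.google.com/'[HOST_AFTER_SLASHES] # "www.abc.google.com"
'http://google.com/some/path'[HOST_AFTER_SLASHES] # "google.com"
'http://example.com:8080/'[HOST_AFTER_SLASHES]    # "example.com:8080"
```

The match keeps the www. prefix and any port number, so it would still need a second step to get the bare domain.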

Answer 2

+5 A:

URI.parse('http://www.abc.google.com/').host
#=> "www.abc.google.com"

Not a regex, but probably more robust than anything we come up with here.

URI.parse('http://www.abc.google.com/').host.gsub(/^www\./, '')

If you want to remove the www. as well, this works without raising any errors when the www. is not there.

Squeegy 2010-07-24 08:50:37
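
The two steps above could be wrapped in one small method. This is a sketch rather than a definitive implementation: `host_without_www` is a hypothetical name, and returning nil for input that has no host or cannot be parsed is an assumption about what the caller wants.

```ruby
require 'uri'

# Hypothetical helper: the host of +url+ without a leading "www.".
# Returns nil when the string has no host or is not a valid URI.
def host_without_www(url)
  host = URI.parse(url).host
  host && host.sub(/\Awww\./, '')
rescue URI::InvalidURIError
  nil
end

host_without_www('http://www.abc.google.com/') # "abc.google.com"
host_without_www('http://www.google.com/')     # "google.com"
host_without_www('http://google.com/')         # "google.com"
host_without_www('google.com')                 # nil (no scheme, so no host)
```

`sub` with `\A` removes at most one www. and only at the very start of the host.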

I want to remove the www. too.

railscoder 2010-07-24 08:55:44

Answer 3

A:

Jörg W Mittag 2010-07-24 08:57:59

I might have framed the question wrongly. What I am trying to do is just remove the leading "http://www." and everything after ".com". So given "http://www.google.com/" it should give google.com, and "http://www.abc.google.com/" should return abc.google.com.

railscoder 2010-07-24 09:01:37

Why do you want to get abc.google.com for http://abc.google.com/ but google.com for http://www.google.com/ ? What makes the 'www' special? It is just a convention that HTTP servers usually run on the host named www, but it doesn't have to be that way.

SchlaWiener 2010-07-24 09:07:03

Yeah. I use a web service which strips off the http and www part of the site name. To compare the results I need to do the same before comparing.

railscoder 2010-07-24 09:18:09
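
For the comparison described above, a regex-only normalisation might look like the sketch below. `normalize_site` is a hypothetical name, and the exact rules the web service applies are not stated in the thread, so lowercasing and dropping the path, query and fragment are assumptions.

```ruby
# Hypothetical helper: reduce a URL or bare site name to a comparable form.
def normalize_site(url)
  url.strip
     .sub(%r{\A[a-z][a-z0-9+.-]*://}i, '') # drop a scheme such as "http://"
     .sub(/\Awww\./i, '')                  # drop one leading "www."
     .sub(%r{[/?#].*\z}m, '')              # drop path, query and fragment
     .downcase
end

normalize_site('http://www.google.com/')     # "google.com"
normalize_site('http://www.abc.google.com/') # "abc.google.com"
normalize_site('google.com')                 # "google.com"
```

Unlike the URI.parse approach, this also accepts input that has already had its scheme stripped, which may matter when both sides of the comparison go through the same method.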
