Hey all!

I currently have a preg_match() call that detects http://, www. and so on, but I also want to detect bare domains such as domain.com or domain.co.uk in a string.

Example string: "Hey hows it going, check out domain.com". Here I want to detect domain.com.

What I want is to detect any major domain in this string (.com, .co.uk, .eu, etc.) in forms like xxx.com or yyy.co.uk, and then return true or false so I can handle it. In this case it would find domain.com.

However I do NOT want it to detect something like:

"hey.i love this site"

This is obviously just a typing error: a missing space after the full stop!

Any ideas? I need to brush up on my regex!

Thanks, Stefan

A:

Since non-Latin domain names were introduced, it is close to impossible to write a regex that works as a complete filter, so I'd say it's not worth trying to use a regex for this anymore. I doubt parse_url() supports them yet either, but using it means someone else has to work out the problems with non-Latin URLs, which is always a bonus :) So use that:

http://au.php.net/parse_url

http://thenextweb.com/me/2010/05/06/monumental-day-internet-nonlatin-domain-names-live/

Edit: OK, to do this from a string, split it into words like this:


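// Split the input on single spaces
$array = explode(" ", $string);
$url = array();

foreach ($array as $word)
{
  // Caveat: parse_url() returns false only for seriously malformed URLs,
  // so most ordinary words will pass this check as well
  if (parse_url($word) !== false)
  {
    $url[] = $word;
  }
}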

OK, parse_url() isn't supposed to be used like this, but as far as I can see there is no other function built into PHP for URL filtering.
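
Since parse_url() on its own accepts almost any word, one way to make the loop above more selective is to let parse_url() extract only the host and then compare its ending against a list of suffixes. This is just a rough sketch under assumptions: find_domains() and the suffix list are made-up names for illustration, and it does nothing about non-Latin domains.

```php
<?php
// Hypothetical helper (not a built-in): returns the hosts found in $string
// that end in one of the given suffixes, e.g. "com" or "co.uk".
function find_domains(string $string, array $suffixes): array
{
    $found = [];
    foreach (preg_split('/\s+/', $string, -1, PREG_SPLIT_NO_EMPTY) as $word) {
        // Drop trailing punctuation such as "domain.com," or "domain.com."
        $word = rtrim($word, '.,;:!?)');
        // Add a scheme when missing so parse_url() treats the word as a host
        $candidate = preg_match('#^[a-z][a-z0-9+.-]*://#i', $word) ? $word : 'http://' . $word;
        $host = parse_url($candidate, PHP_URL_HOST);
        if (!is_string($host)) {
            continue; // parse_url() could not find a host
        }
        $host = strtolower($host);
        foreach ($suffixes as $suffix) {
            $needle = '.' . $suffix;
            // Require at least one character in front of the suffix
            if (strlen($host) > strlen($needle) && substr($host, -strlen($needle)) === $needle) {
                $found[] = $host;
                break;
            }
        }
    }
    return $found;
}
```

Calling find_domains($string, array('com', 'co.uk', 'eu')) and checking whether the result is empty would give the true/false answer the question asks for.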

Thomas Winsnes
A: 

Here is a regexp that would match a provided list of domain zones:

[a-z0-9\-\.]+\.(com|co\.uk|net|org)
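
As written, the pattern has no delimiters, which preg_match() requires. A minimal sketch of how it might be used follows. The function name contains_domain() is made up for illustration, and the `\b` word boundaries and the `i` flag are my own additions, not part of the pattern above.

```php
<?php
// Hypothetical wrapper around the pattern from this answer.
function contains_domain(string $text): bool
{
    // '/' delimiters are required by preg_match(); 'i' makes it case-insensitive.
    // The \b anchors are an added assumption so ".com" inside a longer word is ignored.
    $pattern = '/\b[a-z0-9\-\.]+\.(com|co\.uk|net|org)\b/i';
    return preg_match($pattern, $text) === 1;
}
```

Extend the alternation (for example with `eu`) to cover any other zones you care about.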
serg
Doesn't seem to work?
Stefan