I'm passing a variety of URLs through a query-string variable called target_passthrough, so the URL of a page might look like: http://www.mysite.com/index.php?target_passthrough=example.com

Or something like that. The value of that variable may come in a variety of formats, such as (minus quotes):

  1. "www.example.com"
  2. ".example.com"
  3. "example.com"
  4. "http://www.example.com"
  5. ".example.com/subdir/"
  6. ".example.com/subdir/page.php"
  7. "example.com/subdir/page.php"

Please note how some of those have a period as the first character, such as 2, 5, and 6.

Now, what I am trying to do is pull out just "example.com" from any of those possible scenarios with PHP and store it in a variable to echo out later. I tried parse_url, but it gives me the "www" when that is present, which I do not want. In instances where the URL is just "example.com", it returns a null value.

I don't really know how to do regex matching, or whether that is even what I need, so any guidance would be appreciated--I'm not really that advanced at PHP.

+2  A: 

As you pointed out, you can use parse_url to do much of the work for you and then simply strip off the www or leading dot if it is present.
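A minimal sketch of that approach, assuming PHP 7 or later. The function name extract_host is hypothetical (it is not from the question), and the sketch only strips a leading dot and a single leading "www." label; it makes no attempt to handle other sub-domains.

```php
<?php
// Hypothetical helper: reduce a target_passthrough value to a bare host name.
function extract_host(string $target): string
{
    $target = trim($target);

    // parse_url() only fills in 'host' when a scheme is present, so add one
    // if it is missing (dropping any leading dots first).
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $target)) {
        $target = 'http://' . ltrim($target, '.');
    }

    $host = parse_url($target, PHP_URL_HOST);
    if (!is_string($host)) {
        return ''; // nothing usable was found
    }

    $host = strtolower(ltrim($host, '.'));

    // Strip one leading "www." label if it is there.
    if (strpos($host, 'www.') === 0) {
        $host = substr($host, 4);
    }

    return $host;
}
```

It could then be called with something like extract_host($_GET['target_passthrough']) and the result stored in a variable to echo later.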

An alternative strategy of taking the last two "words" won't always work because there are domains like www.example.co.uk. Using this strategy would give you co.uk instead of example.co.uk. There is no simple rule for determining which parts are the domain or the sub-domain.
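To illustrate why the "last two words" strategy breaks down, here is a small sketch. The function name last_two_labels is made up for this example.

```php
<?php
// Hypothetical helper: keep only the last two dot-separated labels of a host.
function last_two_labels(string $host): string
{
    $labels = explode('.', $host);
    return implode('.', array_slice($labels, -2));
}

echo last_two_labels('www.example.com'), "\n";   // example.com
echo last_two_labels('www.example.co.uk'), "\n"; // co.uk (not example.co.uk)
```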

Mark Byers
If you really really really want to just get the last two, just do a split on the '.', but I would have to agree with OP.
baens
A: 

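parse_url() returns an array of the different parts of the URL. You are probably getting null values because you are only reading one element of that array (the host), and for input without a scheme, such as "example.com", the whole string ends up in [path] while [host] is not set. The general shape of the array is: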

 Array (
        [scheme] => http
        [host] => hostname
        [user] => username
        [pass] => password
        [path] => /path
        [query] => arg=value
        [fragment] => anchor
 )
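As a rough illustration of which keys get filled in, the sketch below parses one value with a scheme and one without. The variable names are only for this example.

```php
<?php
// With a scheme, parse_url() can tell the host apart from the path.
$withScheme = parse_url('http://www.example.com/subdir/page.php');
// $withScheme['host'] is "www.example.com", $withScheme['path'] is "/subdir/page.php"

// Without a scheme, the whole string is treated as a path and no host is set.
$bare = parse_url('example.com');
// $bare['path'] is "example.com"; isset($bare['host']) is false

print_r($withScheme);
print_r($bare);
```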
jon3laze