I've asked this question before, but I discovered a problem with the top answer. It's been suggested that I use parse_url() to get the domain part of any URL. The problem is that the function won't work if the URL doesn't start with http://

Are there any alternatives to using the parse_url function? The only one I can see is adding http:// to the URL, if it's not already there, before feeding it into the function.

+1  A: 
preg_match("/^(?:[^\/]+:\/\/)?([^\/:]+)/",
 "https://www.gmail.google.com/test.html", $mat);
var_dump($mat);

Returns:

array(2) { [0]=>  string(28) "https://www.gmail.google.com" [1]=>  string(20) "www.gmail.google.com" }
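
As a quick check of how the same pattern behaves on input without a scheme, here is a minimal sketch. The function name regex_host and the extra sample inputs are made up for illustration and are not part of the original answer.

```php
<?php
// Hypothetical wrapper around the pattern from the answer above.
// Returns the captured host, or null when the pattern does not match.
function regex_host($input)
{
    $pattern = '/^(?:[^\/]+:\/\/)?([^\/:]+)/';
    if (preg_match($pattern, $input, $mat)) {
        return $mat[1];
    }
    return null;
}

// With a scheme, without a scheme, and with a port.
echo regex_host("https://www.gmail.google.com/test.html"), "\n";
echo regex_host("www.gmail.google.com/test.html"), "\n";
echo regex_host("www.gmail.google.com:8080/test.html"), "\n";
```

All three calls print www.gmail.google.com. Note that the pattern stops at the first colon, so input that carries credentials, such as user:pass@host, would not give the host.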

:)

nlaq
Very neat! I'm gonna have to test this extensively, but so far it works great!
Yegor
A: 

parse_url works perfectly fine with protocols other than http, without changing anything. But of course, if by

The only one I can see is adding http:// to the URL, if it's not already there, before feeding it into the function.

you mean that you're using something like "www.foobar.com/blah", don't expect it to work. Its purpose is to parse URLs, and "www.foobar.com/blah" isn't one.
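
To illustrate, here is a small sketch of what parse_url() returns for that string with and without a scheme. The variable names are made up for illustration.

```php
<?php
// Without a scheme, the whole string is treated as a path,
// so the result has no 'host' key at all.
$noScheme = parse_url("www.foobar.com/blah");
var_dump(isset($noScheme['host']));
var_dump($noScheme['path']);

// With a scheme, host and path are split apart.
$withScheme = parse_url("http://www.foobar.com/blah");
var_dump($withScheme['host']);
var_dump($withScheme['path']);
```

In the first case the only component reported is the path, "www.foobar.com/blah"; in the second the host is "www.foobar.com" and the path is "/blah".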

gizmo
Well, I have a URL entry form on my site, and I need to parse anything URL-like into domain.com. Many people will enter the URL without http://
Yegor
+3  A: 

parse_url() works perfectly with all kinds of URLs, no matter whether the URL has a protocol in it or not. Consider the example from php.net:

$url = 'http://username:password@hostname/path?arg=value#anchor';

print_r(parse_url($url));

echo parse_url($url, PHP_URL_PATH);

The main idea is to pass the second parameter to parse_url().

UPD: The second parameter is available as of PHP 5.1.2.

Stanislav
Caveat: this assumes you have the extension installed. http://www.php.net/manual/en/function.parse-url.php
dcousineau
Nope, that does not work. I have PHP 5.2.6. If I remove the protocol, it's no longer recognized as a URL and the function doesn't have any output.
Yegor
+1  A: 
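$scheme = "http://";

$url = "www.yahoo.com/search/page.html";

// Store the result first: before PHP 5.5, empty() cannot take a function call directly.
$host = parse_url($url, PHP_URL_HOST);

// No host means the scheme was missing, so prepend the default one.
if (empty($host)) {
  $url = $scheme . $url;
}

echo parse_url($url, PHP_URL_HOST);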

So, in cases where no scheme/protocol is defined, it adds http:// in front of the string, and then parse_url() can work properly.

It's usually better to reuse existing functionality than to build regexes that are a complete pain to decipher, debug, or extend for anybody who's not a regex ninja.
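
The same idea can be wrapped in a small reusable function. This is only a sketch under stated assumptions: the name get_host, the $defaultScheme parameter, and the scheme check are hypothetical additions, not part of the snippet above.

```php
<?php
// Hypothetical helper: returns the host part of a URL-like string,
// or null if no host can be found.
function get_host($input, $defaultScheme = "http://")
{
    $input = trim($input);

    // Prepend the default scheme only when there is no "scheme://" prefix.
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $input)) {
        $input = $defaultScheme . $input;
    }

    // parse_url() gives null for a missing component and false for
    // seriously malformed input; treat both as "no host".
    $host = parse_url($input, PHP_URL_HOST);
    return (is_string($host) && $host !== '') ? $host : null;
}

echo get_host("www.yahoo.com/search/page.html"), "\n";
echo get_host("https://www.gmail.google.com/test.html"), "\n";
```

Checking for an existing scheme up front, rather than calling parse_url() twice, is just one possible design choice; the two-call version above works as well.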

Josh Boyd