I don't really want to use cURL or anything complex like that. I'm looking for a simple way to check that a URL (the user enters it and it is then stored in my MySQL database) is well-formed and that it actually exists, i.e. the domain exists. Is fopen a good solution?

Any tutorials or tips on this would be greatly appreciated.

A: 

You should try filter_var

<?php
$url = "http://www.example.com";

if(!filter_var($url, FILTER_VALIDATE_URL))
{
  echo "URL is not valid";
}
else
{
  echo "URL is valid";
}
?> 
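As a rough illustration of what this check does and does not catch, the sketch below runs a few sample strings through filter_var(). The helper name `is_well_formed_url` and the sample URLs are made up for this example. A bare `www.example.com` is rejected because it has no scheme, while a well-formed URL for a domain that may not exist still passes.

```php
<?php
// Hypothetical helper (not part of PHP): true when $url is syntactically a URL.
// It says nothing about whether the domain or page actually exists.
function is_well_formed_url(string $url): bool
{
    return filter_var($url, FILTER_VALIDATE_URL) !== false;
}

// Sample inputs chosen for illustration only.
$samples = [
    'http://www.example.com',  // scheme and host present
    'www.example.com',         // no scheme
    'http://blahblahblah.com', // well-formed, whether or not a site exists
    'not a url',
];

foreach ($samples as $sample) {
    echo $sample, ' => ', is_well_formed_url($sample) ? 'well-formed' : 'rejected', "\n";
}
```

If users are likely to type addresses without `http://`, one option is to prepend a default scheme before validating.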

Read the filter_var manual

irishbuzz
This won't check that the URL actually exists, only that it is valid.
Raoul Duke
ah "the url is valid and that it is actually a url" is a little confusing as something like `http://blahblahblah.com` is actually a URL. There just isn't a site for it :-)
irishbuzz
Sorry, I'll amend the original question; it was a bit vague there.
Callum Johnson
@Callum - no problem. I'll leave this answer up but I agree that [adam's solution](http://stackoverflow.com/questions/3700560/a-quick-question-about-url-validation/3700625#3700625) looks like what you need.
irishbuzz
+5  A: 

cURL isn't that complex. To get the HTTP status of a URL:

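$ch = curl_init('http://www.yahoo.com/');
// Return the response instead of printing it to the page
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);
// The status is 0 if the request failed before any response arrived
$status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
if($status >= 200 && $status < 400) {
    // url is valid
}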

A regex or filter validates that the URL is well-formed, but not that it exists. A DNS lookup validates that the domain exists, but not that the URL itself does.
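The snippet above could be wrapped in a reusable function along these lines. This is only a sketch: the names `is_success_status` and `url_exists`, the 5-second timeout, and the choice to send a HEAD request and follow redirects are assumptions for illustration, not part of the original answer. It needs the cURL extension to be enabled.

```php
<?php
// Hypothetical helper: treats 2xx and 3xx responses as "the URL exists".
function is_success_status(int $status): bool
{
    return $status >= 200 && $status < 400;
}

// Hypothetical wrapper around the cURL calls shown above.
function url_exists(string $url, int $timeoutSeconds = 5): bool
{
    $ch = curl_init($url);
    if ($ch === false) {
        return false; // cURL could not create a handle
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not echo the response
    curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request, skip the body
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // report the status after redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
    curl_exec($ch);
    // 0 means no HTTP response was received (DNS failure, timeout, ...)
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return is_success_status($status);
}
```

Calling `url_exists('http://www.example.com/')` would then return true or false. Some servers answer HEAD requests differently from GET, so dropping the CURLOPT_NOBODY line may be needed in practice.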

adam
+1 good point, dns lookup won't help.
Raoul Duke
Thanks for the reply. Do I need any specific framework (i.e. are more files needed to make the above code work)?
Callum Johnson
You will need cURL support enabled in your PHP setup. You can check this by creating a PHP file containing `<?php phpinfo(); ?>` and then visiting that page in your browser.
irishbuzz
CURL is installed on pretty much every php environment I've ever seen, although I know what they say about assumption...
adam
Oh right, I'm using XAMPP for Windows (migrating to the Mac version soon though). I tried cURL before and it gave me a "Fatal error: call to undefined function" or something like that. I'll try the phpinfo thing now.
Callum Johnson
thank you, I didn't have curl installed.
Callum Johnson
I have now though. Can cURL do validation as well (e.g. check that it's http:)?
Callum Johnson
Not sure what you mean. I'm using MAMP on Mac and it comes with CURL installed.
adam
I had to change the php.ini file; it's sorted now and cURL is active.
Callum Johnson
Can I use cURL to check the url is actually in the url format before we check it with the cURL code you've provided?
Callum Johnson
just seen your regex note on your comment
Callum Johnson
i'm using: if(preg_match('|^http(s)?://[a-z0-9-]+(.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $url)){ //url is valid }else{ //url is not valid }. However: www.google.com seems to be considered invalid by my preg_match statement :/
Callum Johnson
`filter_var($url, FILTER_VALIDATE_URL)` will return true if `$url` is a valid url format
adam
Thank you. When I enter a valid URL, it displays my headers in the div and says the document has moved to the specified URL?
Callum Johnson
Updated the code to add RETURNTRANSFER
adam
+2  A: 

First, validate that it is a valid URL using filter_var():

<?php

$url = "http://www.example.com/foo.php";

if(!filter_var($url, FILTER_VALIDATE_URL))
{
  die('Invalid URL');
}

Next, parse the URL with parse_url() and ensure that it is HTTP(S):

$p_url = parse_url($url);
if(!$p_url) // couldn't parse URL, since parse_url() cannot recognize it
{
  die('Invalid URL');
}
if($p_url['scheme'] != 'http' && $p_url['scheme'] != 'https')
{
  die('Invalid protocol (only HTTP(S) is supported)');
}
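The two steps so far could be folded into one function, roughly as sketched below. The name `parse_http_url` and the null-on-failure return convention are assumptions made for this example; the scheme is lowercased before comparing so that a mixed-case scheme such as `HTTP://` is treated the same way.

```php
<?php
// Hypothetical helper combining the format check and the scheme check.
// Returns the parse_url() parts for a well-formed HTTP(S) URL, or null otherwise.
function parse_http_url(string $url): ?array
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return null; // not well-formed
    }
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return null; // could not be split into a scheme and a host
    }
    $scheme = strtolower($parts['scheme']);
    if ($scheme !== 'http' && $scheme !== 'https') {
        return null; // only HTTP(S) is supported
    }
    return $parts;
}

$parts = parse_http_url('http://www.example.com/foo.php');
echo $parts === null ? 'Invalid URL' : 'Host: ' . $parts['host'], "\n";
```

Returning the parsed parts rather than a boolean lets later checks reuse the host and port without parsing the URL a second time.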

Lastly, check that the host exists and that you can connect to it. I chose to use fsockopen() here, since it checks the hostname and port without actually sending an HTTP request.

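// Default to port 443 for HTTPS and 80 for HTTP when the URL has no explicit port
$port = isset($p_url['port']) ? $p_url['port'] : ($p_url['scheme'] == 'https' ? 443 : 80);
$fp = @fsockopen($p_url['host'], $port, $errno, $errstr, 5); // 5-second timeout; @ hides the warning on failure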
if($fp)
{
  echo 'Valid URL';
  fclose($fp); // Remember to close the file pointer
}else{
  echo 'Invalid server';
}

Note that you might want to avoid this method (depending on what your application needs): if the server is down, the check reports Invalid server even though the server exists. An alternative that only looks up the hostname in DNS, without connecting at all, is to use gethostbyaddr() and gethostbyname():

if(@gethostbyaddr($p_url['host'])) // IP addresses pass this test
{
  echo 'Valid URL';
}else{
  $host = $p_url['host'];
  $addr = gethostbyname($host); // Returns $host unchanged if the lookup fails
  if(!$addr || $host == $addr) // Domain name could not be resolved (i.e. does not exist)
  {
    echo 'Invalid domain name';
  }else{
    echo 'Valid URL';
  }
}
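The DNS-only check can also be written as a small function, sketched below under a few assumptions. The name `host_resolves` and its optional `$resolver` parameter are hypothetical; the parameter exists only so the logic can be tried without network access. This variant uses filter_var() with FILTER_VALIDATE_IP to spot literal IP addresses, instead of the gethostbyaddr() call used above.

```php
<?php
// Hypothetical helper: true when $host is an IP address or resolves via DNS.
// $resolver defaults to PHP's gethostbyname() and can be swapped out for testing.
function host_resolves(string $host, ?callable $resolver = null): bool
{
    // A literal IP address needs no DNS lookup.
    if (filter_var($host, FILTER_VALIDATE_IP) !== false) {
        return true;
    }
    $resolver = $resolver ?? 'gethostbyname';
    $addr = $resolver($host);
    // gethostbyname() returns its input unchanged when resolution fails.
    return is_string($addr) && $addr !== '' && $addr !== $host;
}

// Example with a stand-in resolver that always fails, so no network is needed.
$alwaysFails = function (string $h): string { return $h; };
echo host_resolves('no-such-host.invalid', $alwaysFails) ? 'Valid URL' : 'Invalid domain name', "\n";
```

In real use the second argument would be omitted, e.g. `host_resolves($p_url['host'])`.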

Frxstrem
thank you for this! This looks pretty good.
Callum Johnson