I'm teaching myself some basic scraping and I've found that sometimes the URLs that I feed into my code return 404, which gums up all the rest of my code.

So I need a test at the top of the code to check if the URL returns 404 or not.

This would seem like a pretty straightforward task, but Google's not giving me any answers. I worry I'm searching for the wrong stuff.

One blog recommended I use this:

$valid = @fsockopen($url, 80, $errno, $errstr, 30);

and then test to see whether $valid is empty or not.

But I think the URL that's giving me problems has a redirect on it, so $valid is coming up empty for all values. Or perhaps I'm doing something else wrong.

I've also looked into a "head request" but I've yet to find any actual code examples I can play with or try out.

Suggestions? And what's this about curl?

A: 

I found this answer here:

if (($twitter_XML_raw = file_get_contents($timeline)) == false) {
    // Retrieve HTTP status code
    list($version, $status_code, $msg) = explode(' ', $http_response_header[0], 3);

    // Check the HTTP status code
    switch ($status_code) {
        case 200:
            $error_status = "200: Success";
            break;
        case 401:
            $error_status = "401: Login failure.  Try logging out and back in.  Passwords are ONLY used when posting.";
            break;
        case 400:
            $error_status = "400: Invalid request.  You may have exceeded your rate limit.";
            break;
        case 404:
            $error_status = "404: Not found.  This shouldn't happen.  Please let me know what happened using the feedback link above.";
            break;
        case 500:
            $error_status = "500: Twitter servers replied with an error. Hopefully they'll be OK soon!";
            break;
        case 502:
            $error_status = "502: Twitter servers may be down or being upgraded. Hopefully they'll be OK soon!";
            break;
        case 503:
            $error_status = "503: Twitter service unavailable. Hopefully they'll be OK soon!";
            break;
        default:
            $error_status = "Undocumented error: " . $status_code;
            break;
    }
}

Essentially, you use file_get_contents() to retrieve the URL, which automatically populates the $http_response_header variable with the response headers, including the status code.
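One subtlety: on a 404, file_get_contents() returns false (and emits a warning), but $http_response_header is still populated as long as the HTTP request itself completed. A minimal standalone sketch, assuming $url holds the address you want to test:

$response = @file_get_contents($url);
if (isset($http_response_header[0])) {
    // The request reached the server; parse the status line
    list($version, $status_code, $msg) = explode(' ', $http_response_header[0], 3);
    if ($status_code == 404) {
        // Handle the 404 here
    }
}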

Ross
Interesting -- I'd never heard of that magic global before. http://php.net/manual/en/reserved.variables.httpresponseheader.php
Frank Farmer
+10  A: 

If you are using PHP's cURL bindings, you can check the HTTP status code using curl_getinfo(), like so:

$handle = curl_init($url);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, TRUE);

/* Get the HTML or whatever is linked in $url. */
$response = curl_exec($handle);

/* Check for 404 (file not found). */
$httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
if($httpCode == 404) {
    /* Handle 404 here. */
}

curl_close($handle);

/* Handle $response here. */
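Note that by default cURL does not follow redirects, so for a URL that redirects, CURLINFO_HTTP_CODE reports the 301/302 itself. If you want the status code of the final destination instead, add this before the curl_exec() call:

curl_setopt($handle, CURLOPT_FOLLOWLOCATION, TRUE); /* follow redirects; the final response's code is then reported */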
strager
I'm not familiar with cURL yet, so I'm missing a few concepts. What do I do with the $response variable down below? What does it contain?
@bflora, I made a mistake in the code. (Will fix in a second.) You can see the documentation for curl_exec on PHP's site.
strager
@bflora $response will contain the content of the $url so you can do additional things like checking the content for specific strings or whatever. In your case, you just care about the 404 state, so you probably do not need to worry about $response.
Beau Simensen
Interesting. Right now I'm using $html = new DOMDocument(); @$html->loadHTMLFile($url); $xml = simplexml_import_dom($html); to get the contents of the URLs and step through them to get the elements I need to pull in. Would curl be better?
@bflora, If you send a request to the server, it will process your request and return an HTTP code along with the data. If you request twice, your script is about twice as slow (I/O is the slowest part, usually). If you use the data you received on the first request, it'd be faster.
strager
@bflora, Also, there's an option in PHP (allow_url_fopen) which can disallow fopen() on a URL (and DOMDocument probably uses fopen() in loadHTMLFile()). curl is superior, and it allows for much more configurability (e.g. you can ask for the response to be compressed, or in another language).
strager
+6  A: 

As strager suggests, look into using cURL. You may also be interested in setting CURLOPT_NOBODY with curl_setopt to skip downloading the whole page (you just want the headers).
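A minimal sketch of that idea, assuming $url holds the address to check (CURLOPT_NOBODY turns the request into a HEAD request, so no body is transferred):

$handle = curl_init($url);
curl_setopt($handle, CURLOPT_NOBODY, TRUE);         /* issue a HEAD request; skip the body */
curl_setopt($handle, CURLOPT_RETURNTRANSFER, TRUE); /* don't echo anything */
curl_exec($handle);

$httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
curl_close($handle);

if ($httpCode == 404) {
    /* Handle 404 here. */
}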

Beau Simensen
+1 for mentioning me^W^Wproviding a more efficient alternative, in the case where only the header needs to be checked. =]
strager
A: 

This is just a slice of code; hope it works for you.

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://example.com');
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);

$response = curl_exec($ch);
$errno    = curl_errno($ch);
$error    = curl_error($ch);

$info = curl_getinfo($ch);
curl_close($ch);

return $info['http_code'];
+4  A: 

If you're running PHP 5 you can use:

$url = 'http://www.example.com';
print_r(get_headers($url, 1));

Alternatively, with PHP 4, a user has contributed the following:

This is a modified version of code from "stuart at sixletterwords dot com", at 14-Sep-2005 04:52. This version tries to emulate the get_headers() function in PHP 4. I think it works fairly well, and is simple. It is not the best emulation available, but it works.

Features:
- supports (and requires) full URLs.
- supports changing of default port in URL.
- stops downloading from socket as soon as end-of-headers is detected.

Limitations:
- only gets the root URL (see line with "GET / HTTP/1.1").
- does not support HTTPS (nor the default HTTPS port).

<?php
if (!function_exists('get_headers'))
{
    function get_headers($url, $format = 0)
    {
        $url = parse_url($url);
        $end = "\r\n\r\n";
        // Open a plain TCP connection; use port 80 unless the URL specifies one
        $fp = fsockopen($url['host'], (empty($url['port']) ? 80 : $url['port']), $errno, $errstr, 30);
        if ($fp)
        {
            // Send a minimal HTTP/1.1 request (root path only; see limitations above)
            $out  = "GET / HTTP/1.1\r\n";
            $out .= "Host: ".$url['host']."\r\n";
            $out .= "Connection: Close\r\n\r\n";
            $var  = '';
            fwrite($fp, $out);
            // Read until the blank line that terminates the headers
            while (!feof($fp))
            {
                $var .= fgets($fp, 1280);
                if (strpos($var, $end))
                    break;
            }
            fclose($fp);

            // Drop anything after the end-of-headers marker, then split into lines
            $var = preg_replace("/\r\n\r\n.*\$/s", '', $var);
            $var = explode("\r\n", $var);
            if ($format)
            {
                // Associative format: "Header-Name" => "value"
                $v = array();
                foreach ($var as $i)
                {
                    if (preg_match('/^([a-zA-Z -]+): +(.*)$/', $i, $parts))
                        $v[$parts[1]] = $parts[2];
                }
                return $v;
            }
            else
                return $var;
        }
    }
}
?>

Both would have a result similar to:

Array
(
    [0] => HTTP/1.1 200 OK
    [Date] => Sat, 29 May 2004 12:28:14 GMT
    [Server] => Apache/1.3.27 (Unix)  (Red-Hat/Linux)
    [Last-Modified] => Wed, 08 Jan 2003 23:11:55 GMT
    [ETag] => "3f80f-1b6-3e1cb03b"
    [Accept-Ranges] => bytes
    [Content-Length] => 438
    [Connection] => close
    [Content-Type] => text/html
)

Therefore you could just check that the header response was OK, e.g.:

$headers = get_headers($url, 1);
if ($headers[0] == 'HTTP/1.1 200 OK') {
    // valid
}

if ($headers[0] == 'HTTP/1.1 301 Moved Permanently') {
    // moved or redirected page
}
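The exact string comparisons above assume an HTTP/1.1 server and the standard reason phrases; as a sketch, you could match only the numeric code instead:

$headers = get_headers($url);
if ($headers && preg_match('#^HTTP/\S+\s+200#', $headers[0])) {
    // valid, whatever the HTTP version or reason phrase
}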

W3C Codes and Definitions

Asciant