This is my code:

<?php
$url = "http://www.uhasselt.be/collegeroosters/2009_2010_298_5_10.html";
$headers = get_headers($url, 1);
print_r($headers);
$contloc = $headers["Content-Location"];
echo "Content-Location: " . $contloc . "\n";
$soft404test = strpos($contloc, "http://www.uhasselt.be/404b.htm") ? true : false;
var_dump($soft404test);
?>

This is its output:

Array
(
    [0] => HTTP/1.1 200 OK
    [Content-Length] => 2030
    [Content-Type] => text/html
    [Content-Location] => http://www.uhasselt.be/404b.htm?404;http://www.uhasselt.be:80/collegeroosters/2009_2010_298_5_10.html
    [Last-Modified] => Mon, 22 Aug 2005 07:10:22 GMT
    [Accept-Ranges] => bytes
    [ETag] => "88a8b68fe8a6c51:31c9e"
    [Server] => Microsoft-IIS/6.0
    [MicrosoftOfficeWebServer] => 5.0_Pub
    [X-Powered-By] => ASP.NET
    [Date] => Tue, 24 Nov 2009 08:40:25 GMT
    [Connection] => close
)
Content-Location: http://www.uhasselt.be/404b.htm?404;http://www.uhasselt.be:80/collegeroosters/2009_2010_298_5_10.html
bool(false)

This behavior is unexpected. I thought I was detecting soft 404s by checking the Content-Location header of the HTTP response, but strpos() doesn't give the result I expect: the URL I'm searching for is clearly in the header value, yet the test comes out false. Where did I go wrong? (I don't need this to work on other sites, by the way.)

+3  A: 

strpos() returns false if the string isn't found, and 0 if the string is found at the very beginning. However, 0 also evaluates to false in a boolean check, so you need a strict comparison that checks the type as well:

$soft404test = strpos($contloc, "http://www.uhasselt.be/404b.htm") !== false;
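
To see the difference side by side, here is a minimal sketch using the Content-Location value from the output in the question. The helper name isSoft404() and the example.com URL are made up for illustration; they are not part of the original code.

```php
<?php
// Hypothetical helper: true when the header value contains the 404 page URL.
function isSoft404($contloc)
{
    // Strict comparison: position 0 is a match, only false means "not found".
    return strpos($contloc, "http://www.uhasselt.be/404b.htm") !== false;
}

// Value copied from the print_r() output in the question.
$contloc = "http://www.uhasselt.be/404b.htm?404;http://www.uhasselt.be:80/collegeroosters/2009_2010_298_5_10.html";

$pos = strpos($contloc, "http://www.uhasselt.be/404b.htm");
var_dump($pos);                 // int(0): found at the very beginning
var_dump($pos ? true : false);  // bool(false): 0 is treated as false
var_dump(isSoft404($contloc));  // bool(true)

// Made-up URL that does not contain the 404 page address.
var_dump(isSoft404("http://www.example.com/page.html")); // bool(false)
```

If the server sends no Content-Location header at all, $headers["Content-Location"] will not be set, so checking it with isset() before calling strpos() is probably a good idea.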
Greg