Hi all, I am using a cURL script to fetch a page and get its content for further processing. The link and the cURL script are below:

<?php 
$url = 'http://criminaljustice.state.ny.us/cgi/internet/nsor/fortecgi?serviceName=WebNSOR&amp;templateName=detail.htm&amp;requestingHandler=WebNSORDetailHandler&amp;ID=368343543';

//curl script to get content of given url

$ch = curl_init();

// set the target url

curl_setopt($ch, CURLOPT_URL,$url);

// request as if Firefox

curl_setopt($ch, CURLOPT_HTTPHEADER, Array("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.15) Gecko/20080623 Firefox/2.0.0.15") ); 
curl_setopt($ch, CURLOPT_NOBODY, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result= curl_exec ($ch);
curl_close ($ch);
echo $result;
?>

But the website does not accept the request when it comes from the script: the result contains a user exception. If I paste the URL into a browser, the page opens without any problem.

Please help. What am I doing wrong here?

Thanks and regards

A: 

Is the user agent meant to be in an array like that? I haven't seen it done like that before.

Try just using a plain string, i.e.

curl_setopt($ch, CURLOPT_HTTPHEADER, 'User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.15) Gecko/20080623 Firefox/2.0.0.15');
alex
I changed it to your suggestion and got this error: Warning: curl_setopt() [function.curl-setopt]: You must pass either an object or an array with the CURLOPT_HTTPHEADER, CURLOPT_QUOTE, CURLOPT_HTTP200ALIASES and CURLOPT_POSTQUOTE arguments in C:\xampplite\htdocs\curl\extract.php on line 9
Hi alex, I found what is wrong with my script. It is actually the URL, which has
alex
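
The URL in the question contains HTML-encoded ampersands ("&amp;"). If the script really sends it that way, which is an assumption here, cURL passes the text through literally and the server sees parameter names such as "amp;templateName" instead of "templateName". A minimal sketch of checking for and undoing that encoding follows; decode_copied_url and the example.com URL are hypothetical and used only for illustration:

```php
<?php
// Hypothetical helper (not from the original post): turn a URL copied from
// HTML source into the form the server expects.
function decode_copied_url(string $url): string
{
    // html_entity_decode() turns "&amp;" back into "&".
    return html_entity_decode($url, ENT_QUOTES | ENT_HTML401, 'UTF-8');
}

// Shortened, made-up URL used only for illustration.
$copied = 'http://example.com/cgi?serviceName=WebNSOR&amp;templateName=detail.htm&amp;ID=1';
$clean  = decode_copied_url($copied);

// Compare how the query string is parsed before and after decoding.
parse_str(parse_url($copied, PHP_URL_QUERY), $before);
parse_str(parse_url($clean, PHP_URL_QUERY), $after);

print_r(array_keys($before)); // parameter names as sent with the encoded URL
print_r(array_keys($after));  // parameter names after decoding
```

If the two lists differ, passing the decoded URL to CURLOPT_URL is worth trying before changing any other option.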
A: 

For the user agent, I think you want to use the CURLOPT_USERAGENT option:

curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
xkcd150
+2  A: 

I ran the following script and the page downloaded correctly. This most likely means the server you're running your script from can't reach the server at "criminaljustice.state.ny.us". Either your server is misconfigured, or their server is explicitly blocking you, which is a common result of aggressive screen scraping.

<?php
$url = 'http://criminaljustice.state.ny.us/cgi/internet/nsor/fortecgi?serviceName=WebNSOR&templateName=detail.htm&requestingHandler=WebNSORDetailHandler&ID=368343543';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_HTTPHEADER, Array("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.15) Gecko/20080623 Firefox/2.0.0.15") ); 
curl_setopt($ch, CURLOPT_NOBODY, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result= curl_exec ($ch);
curl_close ($ch);
echo $result;

Additional troubleshooting tip: if you have shell access to the machine your PHP script runs on, run the following command:

curl -I 'http://criminaljustice.state.ny.us/cgi/internet/nsor/fortecgi?serviceName=WebNSOR&templateName=detail.htm&requestingHandler=WebNSORDetailHandler&ID=368343543'

This will output the response headers, which may contain some clue as to why your request is failing.
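
The same information can be gathered from inside PHP when shell access is not available. The sketch below is one possible approach, not the only one: it asks cURL to include the response headers and records the cURL error number, error message, and HTTP status. fetch_with_diagnostics and summarize_report are hypothetical helper names; the curl_* functions and constants come from PHP's cURL extension.

```php
<?php
// Hypothetical helper: fetch a URL and collect details useful for debugging.
function fetch_with_diagnostics(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_HEADER, true);        // include response headers in the result
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);  // give up connecting after 10 seconds

    $raw = curl_exec($ch);
    $report = [
        'errno'   => curl_errno($ch),                        // 0 means no transport error
        'error'   => curl_error($ch),                        // human-readable error text
        'status'  => curl_getinfo($ch, CURLINFO_HTTP_CODE),  // 0 if no response arrived
        'headers' => '',
        'body'    => '',
    ];
    if ($raw !== false) {
        // Split the combined output into headers and body.
        $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
        $report['headers'] = substr($raw, 0, $headerSize);
        $report['body']    = substr($raw, $headerSize);
    }
    curl_close($ch);
    return $report;
}

// Hypothetical helper: turn the report into a one-line diagnosis.
function summarize_report(array $report): string
{
    if ($report['errno'] !== 0) {
        // Transport-level failure: DNS lookup, refused connection, timeout, ...
        return "cURL error {$report['errno']}: {$report['error']}";
    }
    if ($report['status'] >= 400) {
        return "Server answered with HTTP status {$report['status']}";
    }
    return "Request succeeded with HTTP status {$report['status']}";
}

// Usage (performs a real request, so it is left commented out):
// $report = fetch_with_diagnostics($url);
// echo summarize_report($report), "\n", $report['headers'];
```

A non-zero error number points at a connectivity problem on your side, while a 4xx or 5xx status with headers present suggests the remote server received the request and rejected it.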

Alan Storm
A: 

Thanks, it's working.

Thanks a lot.