I heard it is possible to capture web pages using PHP (maybe above 6.0) on a Windows server.

I got some sample code and tested it, but none of it works correctly.

Do you know a reliable way to capture a web page and save it as an image file from a web application?

Please teach me.

A: 

See fopen

David Dorward
A: 

Downloading the HTML of a web page is commonly known as screen scraping. This can be useful if you want a program to extract data from a given page. The easiest way to request HTTP resources is to use a tool called cURL. cURL comes as a standalone Unix tool, but there are libraries for using it from just about every programming language. To capture this page from the Unix command line, type:

curl http://stackoverflow.com/questions/1077970/in-any-languages-can-i-capture-a-webpageno-install-no-activex-if-i-can-plz

In PHP, you can do the same:

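<?php
// Create the cURL handle; curl_init() returns false on failure.
$ch = curl_init() or die("Could not initialize cURL");
curl_setopt($ch, CURLOPT_URL, "http://stackoverflow.com/questions/1077970/in-any-languages-can-i-capture-a-webpageno-install-no-activex-if-i-can-plz");
// Return the response body as a string instead of printing it directly.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$data1 = curl_exec($ch);
if ($data1 === false) {
    // curl_error() needs the handle to report what went wrong.
    die(curl_error($ch));
}
echo "<font color=black face=verdana size=3>" . $data1 . "</font>";
curl_close($ch);
?>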

Before copying an entire website, check its robots.txt file to see whether it allows robots to spider the site. You may also want to check whether an API is available that lets you retrieve the data without the HTML.
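As a rough illustration of the robots.txt check, here is a minimal sketch in PHP. The function name isPathDisallowed is made up for this example, and it is deliberately simplified: it only looks at groups addressed to "User-agent: *" and treats each Disallow value as a plain path prefix, ignoring Allow rules and wildcards. You could feed it the string returned by curl_exec() for the site's /robots.txt.

```php
<?php
// Hypothetical helper (not part of any library): returns true when $path
// starts with a Disallow prefix listed under a "User-agent: *" group.
// Simplified on purpose: Allow rules and wildcard patterns are ignored.
function isPathDisallowed(string $robotsTxt, string $path): bool
{
    $appliesToAll = false; // does the current group include "User-agent: *"?
    $inAgentRun = false;   // are we inside consecutive User-agent lines?

    foreach (preg_split('/\r\n|\r|\n/', $robotsTxt) as $line) {
        // Strip comments and surrounding whitespace.
        $line = trim(preg_replace('/#.*/', '', $line));
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }
        [$field, $value] = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);

        if ($field === 'user-agent') {
            // A new run of User-agent lines starts a new group.
            if (!$inAgentRun) {
                $appliesToAll = false;
            }
            $appliesToAll = $appliesToAll || $value === '*';
            $inAgentRun = true;
            continue;
        }

        $inAgentRun = false;
        // An empty Disallow value means "nothing is disallowed".
        if ($field === 'disallow' && $appliesToAll && $value !== '') {
            if (strpos($path, $value) === 0) {
                return true;
            }
        }
    }
    return false;
}

// Example with an inline robots.txt body (sample data, not a real site's file).
$sample = "User-agent: *\nDisallow: /private/\n";
var_dump(isPathDisallowed($sample, '/private/report.html'));
var_dump(isPathDisallowed($sample, '/questions/1077970'));
```

A real crawler should use a full robots.txt parser; this only shows the idea of checking before you fetch.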

brianegge
Why the downvote hate? The answer obviously doesn't make a lot of sense now, but it did before the question was changed FOUR times.
brianegge
+1  A: 

Though you have asked for a PHP solution, I would like to share yet another solution with Perl. WWW::Mechanize along with LWP::UserAgent and HTML::Parser can help in screen scraping.

Some documents for reference:

Alan Haggai Alavi
+5  A: 

You could use the Browsershots API: http://browsershots.org/

With the XML-RPC interface, you really could use almost any language to access it.

http://api.browsershots.org/xmlrpc/
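To give an idea of what talking to an XML-RPC endpoint from PHP might look like, here is a small sketch using the same cURL functions as above. The helper names buildXmlRpcRequest and postXmlRpc are made up for this example, and system.listMethods is only the conventional XML-RPC introspection call; whether this particular server supports it, and which methods it actually offers, is an assumption you would need to check against its documentation.

```php
<?php
// Hypothetical helper: builds an XML-RPC request body for a method
// that takes only string parameters.
function buildXmlRpcRequest(string $method, array $stringParams = []): string
{
    $xml  = '<?xml version="1.0"?>' . "\n";
    $xml .= '<methodCall>';
    $xml .= '<methodName>' . htmlspecialchars($method, ENT_XML1, 'UTF-8') . '</methodName>';
    $xml .= '<params>';
    foreach ($stringParams as $param) {
        // Escape each value so it cannot break the XML structure.
        $escaped = htmlspecialchars((string) $param, ENT_XML1, 'UTF-8');
        $xml .= '<param><value><string>' . $escaped . '</string></value></param>';
    }
    $xml .= '</params></methodCall>';
    return $xml;
}

// Hypothetical helper: POSTs the request body and returns the raw XML response.
function postXmlRpc(string $url, string $body): string
{
    $ch = curl_init($url);
    if ($ch === false) {
        throw new RuntimeException('Could not initialize cURL');
    }
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $body);
    curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: text/xml']);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $response = curl_exec($ch);
    if ($response === false) {
        throw new RuntimeException(curl_error($ch));
    }
    return $response;
}

// Usage (not executed here, since it needs network access):
// $body = buildXmlRpcRequest('system.listMethods'); // assumed to be supported
// echo postXmlRpc('http://api.browsershots.org/xmlrpc/', $body);
```

The response comes back as XML too, so you would still need to parse it (for example with SimpleXML) to get at the result.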

Jason
A: 

OK, but how would you email the output so that it is not just the raw HTML but the actual rendered website, e.g. when emailing a newsletter?