ansaurus

Question

How to get HTML code of a web page in PHP?

Answer 1

+13 A:

If your PHP server allows url fopen wrappers then the simplest way is:

$html = file_get_contents('http://stackoverflow.com/questions/ask');
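Note that file_get_contents() returns false (and emits a warning) when the request fails, so it is worth checking the result. A minimal sketch, assuming a hypothetical helper name get_page and an arbitrary 10-second timeout, neither of which comes from the answer above:

```php
<?php
// Hypothetical helper: returns the fetched content, or null on failure.
function get_page(string $url, int $timeoutSeconds = 10): ?string
{
    // The 'http' context options apply when $url uses the http(s) wrapper.
    $context = stream_context_create([
        'http' => ['timeout' => $timeoutSeconds],
    ]);

    // @ suppresses the warning; failure is detected through the return value.
    $html = @file_get_contents($url, false, $context);

    // Strict comparison, because an empty page ("") is also falsy.
    return $html === false ? null : $html;
}
```

Calling get_page('http://stackoverflow.com/questions/ask') would then give either the HTML string or null.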

If you need more control then you should look at the cURL functions:

$c = curl_init('http://stackoverflow.com/questions/ask');
// Return the response as a string instead of printing it
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
// curl_setopt($c, ... whatever other options you want...);

$html = curl_exec($c);

if (curl_error($c))
    die(curl_error($c));

// Get the status code
$status = curl_getinfo($c, CURLINFO_HTTP_CODE);

curl_close($c);

Greg 2009-05-04 08:02:20

I am worried about 404s. If the link does not exist, I don't want its content; I want to display an error message instead. How can we tell whether the URL returns a 404 error (that is, whether the URL is working or not)?

Prashant 2009-05-04 10:33:10

@Prashant: I've edited to add a curl_getinfo call which will give you 200 or 404 or whatever

Greg 2009-05-04 11:10:58
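To make the 404 check from the comments concrete, here is a rough sketch that keeps the content only when the status code signals success. The function names fetch_html_if_ok and is_success_status are hypothetical, and treating every 2xx code as "working", following redirects, and using a 10-second timeout are assumptions rather than part of the answer:

```php
<?php
// Assumption: any 2xx status counts as a working URL.
function is_success_status(int $status): bool
{
    return $status >= 200 && $status < 300;
}

// Hypothetical helper: returns the body for a successful response, else null.
// $status receives the HTTP code (0 if no HTTP response was received).
function fetch_html_if_ok(string $url, ?int &$status = null): ?string
{
    $c = curl_init($url);
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($c, CURLOPT_FOLLOWLOCATION, true); // assumption: follow redirects
    curl_setopt($c, CURLOPT_TIMEOUT, 10);          // assumption: 10-second limit

    $html = curl_exec($c);
    $status = (int) curl_getinfo($c, CURLINFO_HTTP_CODE);

    if ($html === false || !is_success_status($status)) {
        return null;
    }
    return $html;
}
```

A caller could then write $html = fetch_html_if_ok($url, $status); and display an error message mentioning $status whenever $html is null.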

Answer 2

+1 A:

Look at this function:

http://ru.php.net/manual/en/function.file-get-contents.php

Sergei 2009-05-04 08:02:21

Answer 3

+1 A:

Simple way: Use file_get_contents():

$page = file_get_contents('http://stackoverflow.com/questions/ask');

Please note that allow_url_fopen must be enabled in your php.ini to be able to use URL-aware fopen wrappers.

More advanced way: If allow_url_fopen is disabled and you cannot change your PHP configuration, use the cURL library (provided ext/curl is installed) to connect to the desired page.
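Which of the two approaches is available can be checked at runtime. A small sketch, assuming a hypothetical function name pick_fetch_method and a preference for file_get_contents() when both are usable:

```php
<?php
// Hypothetical helper: reports which fetch mechanism this PHP setup supports.
function pick_fetch_method(): string
{
    // ini_get() returns a string such as "1", "0", "On" or ""; normalise it to a bool.
    $urlFopen = filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);

    if ($urlFopen) {
        return 'file_get_contents';
    }
    if (function_exists('curl_init')) {
        return 'curl'; // ext/curl is loaded
    }
    return 'none';
}
```

The returned string is only a label for illustration; real code would branch to the corresponding fetch routine.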

Stefan Gehrig 2009-05-04 08:04:11

Answer 4

+2 A:

You may want to check out the YQL libraries from Yahoo: http://developer.yahoo.com/yql

The task at hand is as simple as

select * from html where url = 'http://stackoverflow.com/questions/ask'

You can try this out in the console at: http://developer.yahoo.com/yql/console (requires login)

Also see Chris Heilmann's screencast for some nice ideas on what more you can do: http://developer.yahoo.net/blogs/theater/archives/2009/04/screencast_collating_distributed_information.html

2009-05-04 08:45:37

Answer 5

+2 A:

Also, if you want to manipulate the retrieved page somehow, you might want to try a PHP DOM parser. I find PHP Simple HTML DOM Parser very easy to use.
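As an illustration of what DOM parsing of a fetched page looks like, here is a sketch that uses PHP's built-in DOMDocument class rather than PHP Simple HTML DOM Parser (a deliberate swap, so no third-party file is needed). The function name extract_links is hypothetical, and the input is assumed to be a non-empty HTML string:

```php
<?php
// Hypothetical helper: collects the href of every <a> element in the HTML.
function extract_links(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; buffer parse warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        if ($href !== '') {
            $links[] = $href; // skip anchors without an href
        }
    }
    return $links;
}
```

The $html argument would typically be the string returned by file_get_contents() or curl_exec().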

Dmitri 2009-05-04 09:01:07

very interesting. thanks

Peter Perháč 2009-05-04 09:12:32
