views:

1233

answers:

3

I am developing a project for which I want to scrape the contents of a website in the background and extract some limited content from it. For example, my page has "userid" and "password" fields; using those, I will access my mail account, scrape my inbox contents, and display them in my page. Please help me solve this problem. Thanks in advance.

I have done the above using JavaScript alone, but when I click the sign-in button, the URL of my page (http://localhost/web/Login.html) changes to the URL of the site I am scraping (http://mail.in.com/mails/inbox.php?nomail=....). I want to scrape the details without changing my URL. Please help me solve this problem. Thanks in advance.

+1  A: 

You can use PHP's cURL extension to make HTTP requests to another web site from within your PHP page script. See the cURL section of the PHP documentation.
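
As a minimal sketch (the URL is just the example from the question, and you would still need to handle the mail site's login form and cookies), the fetch could look like this:

// Fetch a remote page server-side with cURL
$ch = curl_init('http://mail.in.com/mails/inbox.php');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body as a string instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow any redirects
$html = curl_exec($ch);

if ($html === false) {
    echo 'cURL error: ' . curl_error($ch);
}
curl_close($ch);

// $html now holds the remote page's markup; parse it and
// echo only the pieces you need into your own page.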

Of course, the downside is that your site will respond more slowly, because you will have to scrape the external web site before you can present the full page to your user.

cruizer
I can scrape the page content, but it redirects the entire page to the new URL. I want only the page contents.
Sakthivel
I don't think it's a good idea to scrape another page's HTML from client-side JavaScript. It's best to do it server-side, within PHP itself.
cruizer
A: 

I have used PHP Simple HTML DOM Parser and it's good. I used it for my Stack Overflow favourites plugin.

Shoban
+2  A: 

Definitely go with PHP Simple HTML DOM Parser. It's fast, easy, and super flexible. It basically loads an entire HTML page into an object, and then you can access any element from that object.

As in the example on the official site, here is how to get all the images and links on the Google homepage:

// Requires the PHP Simple HTML DOM Parser library
include('simple_html_dom.php');

// Create DOM from URL or file
$html = file_get_html('http://www.google.com/');

// Find all images
foreach ($html->find('img') as $element)
    echo $element->src . '<br>';

// Find all links
foreach ($html->find('a') as $element)
    echo $element->href . '<br>';
givp