I have a user ID and a password to log in to a web site via my program. Once logged in, the URL will change from http://localhost/Test/loginpage.html to http://www.4wtech.com/csp/web/Employee/Login.csp.

How can I "screen scrape" the data from the second URL using PHP?

+4  A: 

You would use cURL. It can log in to the page, then follow the redirect to the new page and download its entire contents.

Check out the PHP manual for cURL as well as this tutorial: How to screen-scrape with PHP and Curl.
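A minimal sketch of that two-step flow, assuming the site uses an ordinary form POST and a session cookie. The function names `encode_login_fields` and `fetch_after_login`, the form field names, and the POST target are all hypothetical; inspect the real login form to find the actual ones.

```php
<?php
// Hypothetical helper: URL-encode the login form fields for a POST body.
function encode_login_fields(array $fields): string
{
    return http_build_query($fields, '', '&');
}

// Hypothetical helper: log in, then fetch a page using the same session.
function fetch_after_login(string $loginUrl, string $targetUrl, array $fields): string
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow the redirect after login
        CURLOPT_COOKIEFILE     => '',   // empty string enables in-memory cookies
    ]);

    // Step 1: POST the credentials so the server sets its session cookie.
    curl_setopt($ch, CURLOPT_URL, $loginUrl);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, encode_login_fields($fields));
    if (curl_exec($ch) === false) {
        throw new RuntimeException('Login request failed: ' . curl_error($ch));
    }

    // Step 2: GET the protected page on the same handle, reusing the cookie.
    curl_setopt($ch, CURLOPT_URL, $targetUrl);
    curl_setopt($ch, CURLOPT_HTTPGET, true);
    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('Page request failed: ' . curl_error($ch));
    }

    return $html;
}

// Usage (assumed field names and POST target, not taken from the real site):
// $html = fetch_after_login(
//     'http://www.4wtech.com/csp/web/Employee/Login.csp',
//     'http://www.4wtech.com/csp/web/Employee/Login.csp',
//     ['userid' => 'me', 'password' => 'secret']
// );
```

Reusing one handle keeps the cookie in memory between the two requests, so no cookie file is needed. If the site relies on hidden form tokens or JavaScript, this sketch will not be enough on its own.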

Syntax
+2  A: 

I'm not quite sure I understood your question. But if you really do intend to do screen scraping in PHP, I recommend the simple_html_dom parser. It's a small library that lets you use CSS selectors in PHP. To me, screen scraping has never been easier in PHP. Here's an example:

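// Load the library (adjust the path to wherever simple_html_dom.php lives)
include 'simple_html_dom.php';

// Create DOM from URL or file
$html = file_get_html('http://stackoverflow.com/');

// Find all links and print each one's href attribute
foreach ($html->find('a') as $element) {
    echo $element->href . '<br>';
}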
cg
What part of his question did you not understand? I think it is pretty clear.
Geoffrey Chetwood
I mean no offense. When I read the first version of the question, I was not sure if Sakthivel actually meant screen scraping or URL rewriting.
cg
A: 

Apologies for the plug, but I've written JS_Extractor for screen scraping. It's actually just a very simple extension of the DOM extension, with some helper methods to make things a little easier, but it works very well.
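For reference, here is a rough sketch of what scraping with PHP's built-in DOM extension alone looks like. This is not JS_Extractor's API; `extract_links` is a hypothetical helper name used only for illustration.

```php
<?php
// Hypothetical helper: return the href of every <a> element in an HTML string.
function extract_links(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so collect parser warnings quietly
    // instead of letting them be emitted as PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Query every anchor that actually carries an href attribute.
    $xpath = new DOMXPath($doc);
    $links = [];
    foreach ($xpath->query('//a[@href]') as $anchor) {
        $links[] = $anchor->getAttribute('href');
    }

    return $links;
}
```

The HTML string passed in could be the page body downloaded by any HTTP client. XPath expressions take the place of the CSS selectors that helper libraries offer.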

Jack Sleight
A: 

The SimpleTest unit testing framework has a Scriptable Browser component that can be used on its own. I usually use this for screen scraping and bots, because it can emulate a browser.

troelskn