I am about to finish a script that scrapes and parses a website using Ruby and Mechanize.

I need to port my script to PHP in the future.

My question is:

  • is there a library available for both Ruby and PHP, or
  • can anybody recommend another approach to this?
+1  A: 

There's no direct PHP equivalent of Ruby's Mechanize.

However, Zend Framework offers some great scraping-related components, including

  • Zend_Uri and Zend_Http_Client
  • Zend_Dom
Simone Carletti
+1  A: 

PHP comes with several tools for parsing XML as standard, and the DOM extension can cope with a lot of badly formed HTML.

See

http://uk3.php.net/manual/en/refs.xml.php

C.

symcbean
+1  A: 

For DOM manipulation in PHP, use the DOMDocument class.

Simple and easy :)
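
As a rough sketch of what that might look like, the snippet below loads a small piece of deliberately sloppy HTML into DOMDocument and pulls out the links with DOMXPath. The sample markup and the $links variable are made up for illustration; a real script would fetch the page first, and this covers only the parsing half of what Mechanize does.

```php
<?php
// Hypothetical sample markup (note the unclosed <p>); a real script
// would download the page before parsing it.
$html = '<html><body><p>Intro<a href="/one">One</a><a href="/two">Two</a></body></html>';

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Query every anchor that has an href attribute.
$xpath = new DOMXPath($doc);
$links = [];
foreach ($xpath->query('//a[@href]') as $a) {
    // Map each href to its link text.
    $links[$a->getAttribute('href')] = trim($a->textContent);
}

print_r($links);
```

Form handling, cookies and following links would still need an HTTP client on top of this.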

AntonioCS
+1  A: 

Another DOM manipulation tool for PHP is phpQuery.

Davide Gualano