Is there a good web crawler library available for PHP or Ruby? I'm looking for a library that can crawl depth-first or breadth-first and that resolves links correctly, even when href="../relative_path.html" is used together with a base URL.
+2
A:
For a Ruby library, check out Mechanize.
Note that you would still be responsible for how your crawler traverses sites, i.e. whether it goes depth-first or breadth-first.
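To illustrate the traversal part, here is a minimal sketch using only Ruby's standard library (not Mechanize's API). The names extract_links and crawl, the fetch lambda, and the order/limit parameters are all made up for this example. Links are pulled out with a simple regex, which is only good enough for a demo; a real crawler should use an HTML parser. Relative hrefs such as "../relative_path.html" are resolved with URI.join, against a <base href> if the page has one.

```ruby
require 'uri'
require 'set'

# Hypothetical helper: return absolute http(s) URLs for the <a href> values
# in an HTML string. Relative hrefs are resolved against <base href> when
# present, otherwise against the URL of the page itself.
def extract_links(html, page_url)
  base = html[/<base[^>]*\bhref=["']([^"']+)["']/i, 1]
  base_url = base ? URI.join(page_url, base).to_s : page_url

  html.scan(/<a\s[^>]*\bhref=["']([^"'#]+)/i).flatten.map do |href|
    begin
      uri = URI.join(base_url, href.strip)
      uri.is_a?(URI::HTTP) ? uri.to_s : nil # keeps http and https only
    rescue URI::Error
      nil # skip hrefs that cannot be parsed
    end
  end.compact
end

# Hypothetical crawler: `fetch` is any callable that takes a URL and returns
# the page's HTML, or nil if the page could not be retrieved.
# order: :breadth treats the frontier as a queue, :depth as a stack.
def crawl(start_url, fetch, order: :breadth, limit: 100)
  frontier = [start_url]
  seen = Set[start_url] # URLs already queued, so each is visited at most once
  visited = []

  until frontier.empty? || visited.size >= limit
    url = order == :breadth ? frontier.shift : frontier.pop
    html = fetch.call(url)
    next if html.nil?

    visited << url
    extract_links(html, url).each do |link|
      frontier << link if seen.add?(link)
    end
  end

  visited
end
```

In practice fetch could wrap Net::HTTP or a Mechanize agent; passing it in as a lambda keeps the traversal logic separate from the HTTP code. Note that with order: :depth the stack makes the crawler follow the last link found on each page first.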
AlbertoPL
2009-05-13 03:08:58