I would like to crawl a website. The problem is that it is full of JavaScript: buttons and similar controls that, when pressed, do not change the URL but do change the data on the page.

Usually I use LWP, Mechanize, etc. to crawl sites, but neither supports JavaScript. Any ideas?

+6  A: 

The WWW::Scripter module has a JavaScript plugin that may be useful. Can't say I've used it myself, however.

ishnid
+3  A: 

Another option might be Selenium, via the WWW::Selenium module.

erickb
+5  A: 

WWW::Mechanize::Firefox might be of use. That way you can have Firefox handle the complex JavaScript and then extract the resulting HTML.
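A minimal sketch of how that could look, not tested against any particular site. It assumes Firefox is already running with the MozRepl add-on enabled, which the module uses to talk to the browser. The URL and the `#load-more` selector are hypothetical placeholders.

```perl
use strict;
use warnings;
use WWW::Mechanize::Firefox;

# Assumption: Firefox is running with the MozRepl add-on active.
my $mech = WWW::Mechanize::Firefox->new();

# Hypothetical URL, for illustration only.
$mech->get('http://example.com/');

# Hypothetical selector. synchronize => 0 tells the module not to wait
# for a page load, since the click only changes the DOM via JavaScript.
$mech->click({ selector => '#load-more', synchronize => 0 });

# If the page fetches its data asynchronously, a short pause (or a
# polling loop) may be needed before reading the content.
sleep 1;

# content() returns the HTML as the browser currently sees it,
# i.e. after the scripts have modified the page.
my $html = $mech->content;
print $html;
```

From there the HTML can be parsed with whatever you already use alongside LWP or Mechanize.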

Eric Strom
+1  A: 

iMacros for IE/Firefox/Chrome is a very flexible web scraper and can be controlled from Perl: http://wiki.imacros.net/Perl

SamMeiers