Is there any way I can scrape web pages that use AJAX?
By using something like Ruby + Mechanize on a Linux server that doesn't have a monitor attached (linode.com, for example)?
http://watir.com/ would be a solution, but I guess it's not applicable to Linode.
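For context, a plain Mechanize fetch like the minimal sketch below only returns the initial HTML, so anything the page fills in later via AJAX never shows up (the URL and selector here are just placeholders):

```ruby
require 'mechanize'

# Minimal sketch of what I'm doing now; the URL is a placeholder.
agent = Mechanize.new
page  = agent.get('https://example.com/some-ajax-page')

# Mechanize only sees the initial HTML response, so content that the
# page loads afterwards via XMLHttpRequest is missing from this output.
puts page.search('div.results').text
```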
iMacros for Firefox/Chrome (free/open source) works with many AJAX sites and runs on Linux, too. Use the command line to control its scraping. The Chrome version is still a bit buggy, but the Firefox version works great.
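A rough sketch of what that command-line control can look like from Ruby on a display-less server, assuming xvfb-run is available to provide a virtual display and that your iMacros version supports triggering macros through an imacros:// URL (the macro name is a placeholder, not a real file):

```ruby
# Hypothetical sketch: launch an iMacros for Firefox macro from Ruby,
# wrapping Firefox in xvfb-run so it runs without a physical monitor.
macro = 'scrape_page.iim' # placeholder macro name
ok = system('xvfb-run', 'firefox', "imacros://run/?m=#{macro}")
abort 'iMacros run failed' unless ok
```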
Check out TestPlan. It can do testing without a monitor by using the HTMLUnit backend, and it handles quite a lot of JavaScript, including AJAX. I use it to scrape several pages and have built a number of AJAX tests with it.
You can also run TestPlan with a browser if you want. This gives you the best of both worlds: develop tests and visually see what is happening, and then switch to the display-less mode.