Is there a way to get Nutch to crawl pages that are updated frequently more often?

E.g. index pages and feeds.

It would also be valuable to refresh new pages that contain comments more frequently during the first day after the page is created. Any tips are appreciated.

+1  A: 

What you need is the Adaptive Fetch Schedule. I have written a blog post about how it works. Basically, this scheduler gradually shortens the re-fetch interval for pages that change often, so they get visited more and more frequently.
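To make the idea concrete, here is a rough sketch in Python of how an adaptive schedule of this kind can behave. It is not Nutch's actual code. The function names, the 0.4 / 0.2 rates, and the interval bounds are assumptions chosen for illustration, so check your own Nutch configuration for the real values.

```python
# Simplified illustration of an adaptive re-fetch schedule.
# NOT the Nutch implementation: names, rates and bounds are assumed.

DAY = 24 * 3600.0  # seconds


def next_interval(interval, changed, inc_rate=0.4, dec_rate=0.2,
                  min_interval=60.0, max_interval=30 * DAY):
    """Return the next re-fetch interval in seconds.

    If the page changed since the last fetch, shrink the interval so the
    page is visited sooner; otherwise grow it. The result is clamped to
    [min_interval, max_interval].
    """
    if changed:
        interval *= (1.0 - dec_rate)
    else:
        interval *= (1.0 + inc_rate)
    return max(min_interval, min(max_interval, interval))


def simulate(history, start=DAY):
    """Apply next_interval once per fetch; history is a list of booleans
    saying whether the page had changed at each fetch."""
    interval = start
    for changed in history:
        interval = next_interval(interval, changed)
    return interval


if __name__ == "__main__":
    # A feed that changes on every fetch drifts toward short intervals,
    # while a static page drifts toward long ones.
    print("busy feed  :", simulate([True] * 10) / 3600.0, "hours")
    print("static page:", simulate([False] * 10) / 3600.0, "hours")
```

Under these assumptions, an index page or feed that has changed on every visit ends up with a much shorter interval than a page that never changes, which is the behaviour the question asks for.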

Pascal Dimassimo
Thanks Pascal. That seems like a great plugin and I'm looking forward to checking it out.
grm