I want to build a web crawler based on Scrapy to grab news pictures from several news portal websites. I want this crawler to:
Run forever.
That is, it should periodically re-visit some portal pages to pick up updates.
Schedule by priority.
Give different priorities to different types of URLs.
Fetch pages with multiple threads.
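To make these three points concrete, here is a rough plain-Python sketch of the behaviour I'm after. It is not Scrapy code and not a real crawler: `RevisitScheduler`, `fetch`, `crawl_once`, and the intervals and priorities are names and values I made up for illustration, and `fetch` is only a stub that does no network access.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


class RevisitScheduler:
    """Toy scheduler: each URL has a priority and a re-visit interval (seconds)."""

    def __init__(self):
        # Heap entries: (due_time, -priority, url, interval)
        self._heap = []

    def add(self, url, priority, interval, due=0.0):
        heapq.heappush(self._heap, (due, -priority, url, interval))

    def pop_due(self, now):
        """Return URLs due at `now`, highest priority first, and re-queue them."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap))
        # Priority is stored negated, so ascending sort puts high priority first.
        due.sort(key=lambda entry: (entry[1], entry[2]))
        for _, neg_priority, url, interval in due:
            # Re-visit: schedule the same URL again one interval from now.
            heapq.heappush(self._heap, (now + interval, neg_priority, url, interval))
        return [url for _, _, url, _ in due]


def fetch(url):
    # Placeholder (assumption): a real crawler would download the page here.
    return "fetched " + url


def crawl_once(scheduler, now, workers=4):
    """Fetch every URL that is due at `now`, using a small thread pool."""
    urls = scheduler.pop_due(now)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as `urls`.
        return list(pool.map(fetch, urls))
```

Calling something like `crawl_once(scheduler, now=time.time())` in an endless loop is roughly the "run forever" part; what I don't know is how to get the equivalent behaviour out of Scrapy itself.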
I've read the Scrapy documentation but haven't found anything related to the points above (maybe I didn't read carefully enough). Does anyone here know how to do this, or could you share some ideas or examples? Thanks!