There is a website I want to keep up with by scraping some content from it every day. I know the site is updated manually at a certain time, and I've set cron schedules to reflect this, but because the update is manual it can happen 10 or even 20 minutes late.

Right now I have a hack-ish cron job that runs every 5 minutes, but I'd like to use the deferred library to do this more precisely. I'm trying to chain deferred tasks: check whether there was an update, defer the same check for a couple of minutes if there was none, and keep deferring until there finally is an update.

I have some code I thought would work, but it only ever defers once, when instead I need to continue deferring until there is an update:

(I am using Python)

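from google.appengine.ext import deferred

class Ripper(object):
    def rip(self):
        if siteHasNotBeenUpdated:
            # No update yet: schedule this same method to run again in 120 seconds.
            deferred.defer(self.rip, _countdown=120)
        else:
            updateMySite()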

This was just a simplified excerpt obviously.
I thought this was simple enough to work, but maybe I've just got it all wrong?

+2  A: 

The example you give should work just fine. You need to add logging to determine if deferred.defer is being called when you think it is. More information would help, too: How is siteHasNotBeenUpdated set?

Nick Johnson
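
As a rough illustration of the logging suggested above, here is a minimal sketch rather than a definitive fix. The `defer` wrapper, the `scheduled` list, and the `site_has_not_been_updated` / `update_my_site` methods are hypothetical names introduced for this example. The sketch assumes the update check is a method that is re-evaluated on every run; if the real flag is computed only once, that could be one reason the chain stops after the first defer, but that is a guess without seeing the rest of the code.

```python
import logging

try:
    from google.appengine.ext import deferred
except ImportError:
    deferred = None  # not running on App Engine; use the stand-in below

# Hypothetical: records scheduled calls when App Engine is not available,
# so the chain can be stepped through by hand.
scheduled = []


def defer(func, *args, **kwargs):
    """Log every scheduling attempt, then hand off to deferred.defer if present."""
    logging.info("deferring %s with %r", getattr(func, "__name__", func), kwargs)
    if deferred is not None:
        deferred.defer(func, *args, **kwargs)
    else:
        scheduled.append((func, args, kwargs))


class Ripper(object):
    def site_has_not_been_updated(self):
        # Hypothetical placeholder: fetch the site and compare it with the
        # last known state on each call, instead of reading a stored flag.
        raise NotImplementedError

    def update_my_site(self):
        # Hypothetical placeholder for the real update step.
        raise NotImplementedError

    def rip(self):
        if self.site_has_not_been_updated():
            logging.info("rip: no update yet, retrying in 120 seconds")
            defer(self.rip, _countdown=120)
        else:
            logging.info("rip: update found, updating")
            self.update_my_site()
```

With this in place, the logs should show one "deferring rip" line per retry. If only one such line ever appears, the next thing to look at is what the second run of `rip` logs, or whether it runs at all.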